Journal of 


Experimental Psychology


ARTHUR W. MELTON, Editor 
Air Force Personnel and Training Research Center
Air Force Base
San Antonio, Texas
CONSULTING EDITORS 


Neal E. Miller, Yale University 

Kenneth W. Spence, State University of Iowa
Benton J. Underwood, Northwestern University
Delos D. Wickens, Ohio State University


Lorraine Bouthilet, Managing Editor 





Stereopsis Produced Without Horizontally Disparate Stimulus Loci: P. C. Squires
Vigilance in the Detection of Low-Intensity Visual Stimuli: J. A. Adams





American Psychological Association 
Vol. 52, No. 3 September 1956






is $8.00, or $16.00 annually. Single copies are $1.50. 

address changes, and business communications should be addressed to the American Psychological Association, Inc., 1333 Sixteenth St. N.W., Washington 6, D. C.





Journal of Experimental Psychology

Vol. 52, No. 3 September, 1956














THE DEPENDENCE OF INTERRESPONSE TIMES UPON THE 
RELATIVE REINFORCEMENT OF DIFFERENT 
INTERRESPONSE TIMES ¹


DOUGLAS ANGER


Harvard University ²


The literature contains indications 
that the time intervals between re-
sponses, i.e., the interresponse times
(IRTs), made by rats depend on
which IRTs have been followed more 
often by food reinforcements. In the 
present study evidence on the occur- 
rence and nature of this dependence 
was obtained by comparing IRT 
distributions with reinforced-IRT 
distributions, and by altering the 
allocation of reinforcements to IRTs 
(reinforcement of an IRT refers to 
reinforcement of the terminal response 
of that IRT). 

The first indication that the IRT distribution is affected by the relative reinforcement of IRTs of different lengths came from the observation of Skinner (4) that fixed-interval reinforcement (FI)³ produces a considerably lower rate of response with rats than does fixed-ratio reinforcement (FR). The lower rate of FI was found both from comparisons of FI and FR which produce the same number of responses per reinforcement (4, p. 284 ff.), and from comparisons of FI and FR which produce the same number of reinforcements per hour (4, calculated from Fig. 98 and Table 2). Skinner pointed out that FI favors the reinforcement of long IRTs more than FR does. Hence the greater reinforcement of slow responding by FI may be responsible for the lower response rate during FI.

Another experiment described by Skinner (4, p. 306) indicates that the response rate of rats is lowered by reinforcing only IRTs greater than 15 sec. Wilson and Keller (6) also have reported that the selective reinforcement of long IRTs decreases the response rate of rats, though the interpretation of their results is complicated by a decrease in the rate of reinforcement along with the selective reinforcement.

¹ This article is based on a thesis submitted in partial fulfillment of the requirements for the Ph.D. degree at Harvard University, 1955. The investigation was supported at first by the Society of Fellows of Harvard University and later by the Medical Research and Development Board, Office of the Surgeon General, Department of the Army, under contract No. DA-49-007-MD-408. The advice and encouragement of Drs. B. F. Skinner and Edwin B. Newman are gratefully acknowledged.

² Now at the Pharmacology Dept., Upjohn Co., Kalamazoo, Mich.

³ Interval reinforcement refers to a schedule that reinforces the first response of the chosen class after certain events specified by a timer. In fixed-interval reinforcement (FI) these events are separated by fixed time-intervals; in variable-interval reinforcement (VI) the intervals are variable. Ratio reinforcement refers to a schedule that reinforces a response only after a certain number of unreinforced responses have occurred since the last reinforced response. The number may be fixed or variable giving rise to fixed-ratio (FR) and variable-ratio (VR) reinforcement.

Mueller’s study (3) of the IRT 
distribution during 3-min. FI indi- 
cated that there is no definite differ- 
ence in the response probability at 
different times after the last response. 
This evidence contrasts with the 
indication presented earlier that FI 
may produce differences in the re- 
sponding at different times after the 
last response. Mueller does not men- 
tion the duration of FI before the 
session analyzed, so perhaps the 
duration was insufficient for the Ss to 
react to the FI contingencies. 

The present study has sought to 
determine, under several conditions, 
the extent of the control over IRTs 
exerted by the differential reinforce- 
ment of IRTs of different length. 
For this purpose the IRT distributions 
were determined and comparisons 
were made with the theoretical and 
observed distributions of reinforced 
IRTs. Comparisons were made soon 
after conditioning, after extended 
exposure to variable-interval rein- 
forcement (VI), and after a large 
experimentally produced change in 
the relative reinforcement of different 
IRTs. A VI schedule was used 
because long exposure to FI results in 
a discrimination of the reinforcement 
period (4) which would have compli- 
cated the interpretation of results. 


Method
Apparatus 


Animal cages.—Five Skinner boxes of iden-
tical design were used. The cage holding S was




124 in. long, 8 in. wide, 84 in. high. The ceiling, 
the two long walls, and the floor were #,-in. 
stainless-steel rods centered 4 in. apart. The 
end walls were sheet stainless steel. Sections of 
the rods forming the floor (two rows of four 
sections) were supported separately, and were so 
hinged and counterweighted that when more 
than 43 gm. of S’s weight was on a section, it 
moved down about yy in. and operated a micro- 
switch. This arrangement signaled the activity 
and position of S, but played no role in this 
experiment. 

The bar was yx in. thick, and 27, in. wide, but 
tapered to 1 in. wide at the rounded front edge. 
It was made from thin sheet stainless steel. The 
front edge could be pressed down 4 in., and a 
microswitch operated near the middle of this 
excursion. A magnet and the weight of the bar 
were so adjusted that a downward force of 15-gm. 
weight on the front edge was just sufficient to 
move the bar, but after the bar had been moved 
ys in., 8-gm. weight completed the depression. 
Hence any movement of the bar by S almost 
always produced a complete depression and 
switch operation. The bar mounting prevented 
bouncing of the bar after a hard blow, and thus 
prevented an error in the IRTs recorded. 

A small motor could move the bar either out 
of reach of S or to a position where the bar 
projected 4 in. into the cage through a hole in 
the end wall. Movement into the cage was 
slow and gentle. Removal was rapid and 
forceful, but the smooth surface of the bar and 
the design of the cage hole prevented any injury 
or disturbance of S. 

To the right of the bar was a hole about 2 in. 
square covered by a light stainless-steel door. 
The door was hinged at its upper edge, and Ss 
pushed it open easily to reach a small food 
chamber. The reinforcement pellets were drop- 
ped into this chamber by a rotary magazine, and 
once a day a trap-door feeder dropped in several 
grams of the same pellets. A water bottle was 
always present in the corner diagonally opposite 
the food chamber. 

Each apparatus was enclosed in a box, which 
together with fan noises in the room so reduced 
the audibility in each box of sounds from other 
boxes that the human ear could not detect them. 
Sounds from the electrical control circuit, located 
in a separate room, were considerably below the 
human threshold in the room containing the 
boxes, and the boxes added further attenuation.
The cages were illuminated through a diffuser
in the box roof. The illumination on the floor 
of the cage was about .1 ft.-candle. A window 
in each box permitted observation of S. Air was 
continuously introduced into the boxes by a 
blower. 







The temperature was held to 71 ± 1.5° F. and
the relative humidity between 52% and 59% 
except for several slight departures. 

Electrical circuit.—Separate counters were 
used to count IRTs of different length. The 
group of IRT lengths counted by one counter 
will be called a band, and the 4-8 sec. band refers 
to IRTs between 4 and 8 sec. long. IRTs from
4 to 72 sec. long were counted in 17 bands with
the band width (range of IRTs counted by one
counter) 4 sec. for each band. The > 72-sec.
and > 4-sec. IRTs were also counted; the
> 4-sec. IRTs provided a check on the sum of
the other counters. Subtraction of the > 4-sec.
IRTs from the total IRTs (responses minus one)
gave the 0-4 sec. IRTs. Another set of counters
recorded reinforced IRTs in the same bands. 
The circuit could also limit reinforcements to 
IRTs in chosen bands. The IRT measured was 
the time between two successive downward 
movements of the bar. 
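The counter arithmetic described above can be sketched as follows. This is only an illustration of the bookkeeping, not the circuit itself; the function names and the counter readings are hypothetical and are not data from the study.

```python
N_TIMED_BANDS = 17  # the 4-8, 8-12, ..., 68-72 sec. bands named in the text


def zero_to_four_count(total_responses, over_4_count):
    """0-4 sec. IRTs = total IRTs (responses minus one) minus the > 4-sec. IRTs."""
    total_irts = total_responses - 1
    return total_irts - over_4_count


def counters_agree(band_counts, over_72_count, over_4_count):
    """The > 4-sec. counter should equal the 17 band counters plus the > 72-sec. counter."""
    return sum(band_counts) + over_72_count == over_4_count


# Hypothetical readings for one session (assumed values, for illustration only).
band_counts = [30, 22, 15, 10, 7, 5, 3, 2, 2, 1, 1, 1, 0, 0, 1, 0, 0]
over_72 = 2
over_4 = 102
total_responses = 161

print(counters_agree(band_counts, over_72, over_4))
print(zero_to_four_count(total_responses, over_4))
```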

The following technique was used for this 
IRT analysis. A timer produced short pulses 
every .400 sec., and every tenth pulse from this 
timer operated a stepping switch. When a 
response occurred, the stepping switch did not 
move for .120 sec. to allow a .080-sec. pulse to
operate a counter, but which counter received 
the pulse depended on the position to which the 
stepping switch had advanced. Following the 
.120-sec. period, the stepping switch reset to its 
starting position. During the resetting period 
the timer pulses were temporarily counted by a 
relay circuit, so the stepping switch took over 
only after ample time for resetting. 

With this technique an IRT may start at any 
time during the .400 sec. between two timer 
pulses and still produce the same counter record. 
The consequences of this fact were treated as 
follows. The boundary of a band was defined 
as the IRT length at which 50% of the IRTs 
registered in that band and 50% registered in the 
neighboring band. From analysis of the circuit 
properties, and from checks on its performance, 
it was found that all bands except the 0-4 sec. 
band were close to 4.000 sec. wide in the long 
run.⁴ However, two errors did result from the
technique described and other circuit properties:


⁴ There is a slight error due to the asymmetry
of the IRT distribution. Calculations showed
this error, the number of IRTs gained by a band
expressed as a percentage of the IRTs in the 
band, to be .2% when the probability of response 
in each band, given an opportunity, is .9. The 
error is less at lower probabilities (.01% when 
the probability is .4). The error is increased by 
inequality of the response probability in different 
bands, but it remains negligible for all observed 
distributions. 




(a) The 0-4 sec. band was actually 4.16 sec. wide. 
Since the discrepancy between this width and 
that of other bands was but 4%, it was neglected. 
For brevity the labels “0-4 sec.,” etc. will be 
used rather than the more accurate ones, “0-4.16
sec.,” etc. The “0-4 sec.” label also neglects
the inability of the apparatus to record < .04- 
sec. IRTs. (b) An additional sampling error 
was introduced for all bands. In the long run 
about as many responses were gained at band 
boundaries as were lost, but samples were sub- 
ject to chance variation in the number gained 
and lost. This boundary variance is consid- 
erably smaller than the sampling variance 
resulting from a constant response probability 
in the band on each opportunity, except at high 
probabilities. Above a response probability of 
.9 the boundary variance becomes the greater. 
Thus for most of the measurements the variation 
due to this technique was less than an inherent 
sampling variability. 

The counter records were checked occasion- 
ally with a Standard Electric timer (.01-sec. 
divisions). The observed differences were con-
sistent with the errors specified above.

Reinforcements were programmed with 
punched tapes moved by timers. For each S a
tape assigned reinforcements to a storage circuit. 
During VI an assignment resulted in pellet 
delivery at the next response “start.” The 
storage circuit could hold up to 10 assignments, 
so if a tape assigned a second reinforcement 
before delivery of the first, the second was not 
lost, but was delivered after the second response 
(only one pellet to a response). Storage was 
unimportant with VI where the response rate was 
usually high enough for delivery of an assign-
ment before another occurred, but storage was
important during reinforcement of only long 
IRTs where S failed for long periods to make 
IRTs long enough for reinforcement. Pellets 
were always delivered at downward movements 
of the bar. 
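The storage rule for plain VI can be sketched as a simple counter. This is a minimal model, not the relay circuit. Two points are assumptions, since the text does not state them: assignments beyond the 10 the circuit could hold are dropped, and an assignment coinciding with a response is stored before that response is evaluated. The function name and the event times are hypothetical.

```python
def deliver_pellets(assignment_times, response_times, capacity=10):
    """Return the response times at which a pellet is delivered under plain VI."""
    # kind 0 = tape assignment, kind 1 = response; sorting puts an assignment
    # before a response that occurs at the same instant (assumed tie rule).
    events = sorted([(t, 0) for t in assignment_times] +
                    [(t, 1) for t in response_times])
    stored = 0
    delivered = []
    for t, kind in events:
        if kind == 0:
            stored = min(stored + 1, capacity)  # assumed: overflow is lost
        elif stored > 0:
            stored -= 1                          # only one pellet to a response
            delivered.append(t)
    return delivered


# Two assignments arrive before any response; they are paid out on the
# first and second responses, one pellet each.
print(deliver_pellets([10, 12], [15, 16, 40]))
```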


Subjects 


Five experimentally naive male albino rats 
of the Wistar strain were Ss and will be identified 
as S1, S2, S3, S4, and S5. All were 228 ± 12
days old on the day of conditioning. 


Procedure 


Four Ss (S1, S2, S3, S4) received the same
treatments (No. 1 to 6 below); S5 supplements
the data from Treatment 1. By restricted 
feeding Ss were reduced to about 25% below 
their initial ad lib. weight. They were then 
moved into the cages described and thereafter 
each S spent 22 hr. a day in the same cage (2 hr. 







were spent in another cage during feeder loading, 
etc.). After conditioning, bar-pressing periods 
occurred every day. All treatment changes 
occurred between successive days. 

Feeding and habituation.—In the experimental 
cages Ss were fed a weighed amount of food daily 
at a regular time, the feeding period. Both 
reinforcements and feeding-period food consisted 
of .046-gm. pellets made from Purina Lab Chow 
by the standard procedure of the P. J. Noyes Co. 
Before conditioning 30 days were allowed for 
adjustment of the daily feeding to bring S’s 
weight closer to 25% below its initial weight. 

After conditioning, the total amount of food 
fed each S daily was not changed. The amount
was one that kept S at about 25% below his 
initial weight, and ranged from 10.7 to 12.8 gm. 
for different Ss. An exception was made for S, 
after 37 days of bar pressing. Because S; began 
losing weight and appeared unwell, his feeding 
was raised 6 gm. When his weight had returned
to its former value 43 days later, his feeding was 
reduced .2 gm. and thereafter held constant. 
The weight range of each of the other four Ss was 
less than 22 gm. (< 6% of ad lib. weight). Most
of this range resulted from a very slow drift. 
The range of body weight for S,; was 41 gm. 
(12%), though when the disturbance described 
above is excepted, the range is like the other Ss. 
The mean weight of each S was between 23% 
and 27% below the initial ad lib. weight. 

Magazine training (6 days).—Between 84 and 
180 pellets were used to train S to go to the food 
chamber and eat after the loud magazine noise. 
This training began six days before bar con-
ditioning. The first few pellets were delivered
when S’s head was in the food chamber, hence 
they were eaten soon after delivery. The other 
pellets were delivered when S’s head was not in 
the food chamber. The pellets were given at 
irregular intervals, and consistent delivery after 
any particular behavior was avoided. To mini-
mize the conditioning of other behavior, pellets
were stopped when S consistently reached them 
within 2 sec. after delivery. 

Treatment 1: Conditioning of bar pressing 
and 5-min. VI (38 days).—On the day of con- 
ditioning, after delivery of about 10 magazine- 
training pellets, the bar was introduced for the 
first time. The first bar depression was rein- 
forced, as were the next 15 depressions that 
followed pellet eating. (If S pressed the bar 
again before eating a delivered pellet, a second 
pellet was not given.) After about five rein-
forcements on a 1-min. VI schedule, 5-min. VI
began and continued for the rest of Treatment 1. 
The daily bar session, the time the bar projected 
into the cage, was about 90 min. on the day of 
conditioning. For the next 25 days the bar 




session was 2 hr. long and ended 3 hr. before the 
feeding period. During the last 12 days of 
Treatment 1, and during Treatments 2, 3, and 4, 
it was shortened to 1 hr. and ended 4 hr. before 
the feeding period. The 5-min. VI schedule 
assigned reinforcements with the following
spacings (in seconds) and sequence: 6, 50, 170,
770, 530, 24, 290, 410, 100, 650. Each day the
sequence began where it left off the day before. 

Treatment 2: Reinforcement of only > 40-sec. 
IRTs on a 5-min. VI schedule (48 days).—This 
treatment was the same as Treatment 1 except
that after reinforcement assignment the next
bar press was not necessarily reinforced, but the
next bar press terminating a > 40-sec. IRT was.

Treatment 3: Reinforcement of only > 40-sec. 
IRTs on a 2.5-min. VI schedule (31 days).—Only
the schedule assigning reinforcements was
changed. The new reinforcement spacings (in
seconds) and sequence were 6, 45, 135, 390, 240, 
24, 180, 70, 100, 310. 
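As a check on the two tapes, the mean spacing of each sequence can be computed directly. The sketch below reads the second spacing of the Treatment 1 tape as 50 sec., the value that makes the mean come out at exactly 5 min.; the variable and function names are hypothetical.

```python
# Spacings (in seconds) as listed for Treatment 1 and Treatment 3.
TAPE_5_MIN = [6, 50, 170, 770, 530, 24, 290, 410, 100, 650]
TAPE_2_5_MIN = [6, 45, 135, 390, 240, 24, 180, 70, 100, 310]


def mean_spacing_sec(spacings):
    """Average time between successive reinforcement assignments."""
    return sum(spacings) / len(spacings)


print(mean_spacing_sec(TAPE_5_MIN) / 60)    # mean in minutes, 5-min. VI tape
print(mean_spacing_sec(TAPE_2_5_MIN) / 60)  # mean in minutes, 2.5-min. VI tape
```

The sums are 3000 sec. and 1500 sec. over ten spacings, so the means are 300 sec. and 150 sec., in agreement with the schedule names.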

Treatment 4: Return to 5-min. VI with no 
other restriction on IRTs reinforced (30 days).—
Same as Treatment 1 above.

Treatment 5: Continuation of Treatment 4 but 
with four 1-hr. bar sessions daily (24 days).—For
9 days there were four 1-hr. bar sessions, 3 hr.
apart, and the last session ended 3 hr. before the
feeding period. For the next 15 days there were
four 1-hr. bar sessions, 2 hr. apart, and the last
session ended 14 hr. before the feeding period. 

Treatment 6: Reinforcement of only < 28-sec. 
IRTs on a 5-min. VI schedule (13 days).—The 
four 1-hr. bar sessions of Treatment 5 were
continued. After a reinforcement assignment, 
the next response start which terminated a < 28- 
sec. IRT was reinforced. 
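The IRT restrictions of the six treatments can be summarized in a short sketch. It assumes that a reinforcement has already been assigned and is waiting in storage; the function names and the sample IRTs are hypothetical.

```python
def eligible(irt_sec, treatment):
    """True if a response ending an IRT of this length may be reinforced."""
    if treatment in (2, 3):
        return irt_sec > 40      # only > 40-sec. IRTs
    if treatment == 6:
        return irt_sec < 28      # only < 28-sec. IRTs
    return True                  # Treatments 1, 4, 5: no restriction


def first_reinforced_index(irts, treatment):
    """Index of the first IRT whose terminal response collects a waiting assignment."""
    for i, irt in enumerate(irts):
        if eligible(irt, treatment):
            return i
    return None                  # assignment stays in storage


irts = [3, 12, 45, 7]            # hypothetical IRTs, in seconds
print(first_reinforced_index(irts, 1))
print(first_reinforced_index(irts, 2))
```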

Treatment of S5.—The data from S5 supple-
ment the Treatment-1 data of other Ss. The
same preliminaries and Treatment 1 were given
to S5 except a 2-hr. bar session was used through-
out, and 33 days of Treatment 1 were followed
by 25 days of 2.5-min. VI.


Results and Discussion


Treatment 1: Conditioning and 5-Min. VI (38 Days)


Figures 1A and 1B show for S, the 
relative frequency of different IRTs 
at the start and end of Treatment 1.
Certain considerations suggest an- 
other treatment of these data. Non- 
response for at least as long as the 
shortest IRT included in a band is 
necessary for an IRT to occur in that 
band. Since this period of non- 







response is a necessary, though not 
sufficient, condition for the IRT, it 
may be said to provide an opportunity 
for the IRT. In a sample of re- 
sponding the number of opportunities 
for response in a given band is the 
number of IRTs in that band plus 
the number of longer IRTs. Conse- 
quently S has more opportunities for 
short IRTs than for long IRTs. An 
opportunity for a 0-4 sec. IRT occurs 
during every IRT, an opportunity for 
an 8-12 sec. IRT occurs during all 
IRTs except 0-4 and 4-8 sec. IRTs, 
etc. This difference in the number 
of opportunities may influence the 
relative frequency of response in 
different bands. But the different 
bands can be equated with respect to 
opportunities by calculating the num- 
ber of IRTs in a band per opportunity 
in that band (IRTs/Op.). The 
IRTs/Op. is a statistic which esti- 
mates the probability of response 
between x and y sec. after the last 
response, given that S reaches x sec. 
after the last response.⁵ Figures 1C
and 1D present the data of Fig. 1A
and 1B recalculated to show the
IRTs/Op.
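The IRTs/Op. calculation can be sketched in a few lines: the opportunities for a band are the IRTs in that band plus all longer IRTs. The function name and the counts below are hypothetical and are not data from the study.

```python
def irts_per_opportunity(band_counts):
    """Return IRTs/Op. for each band.

    band_counts[i] is the number of IRTs in band i, ordered from the
    shortest band to the longest; the last entry may be an open-ended
    band such as "> 72 sec.".
    """
    remaining = sum(band_counts)      # every IRT is an opportunity for the first band
    values = []
    for count in band_counts:
        values.append(count / remaining if remaining else None)
        remaining -= count            # IRTs ending here give no opportunity to longer bands
    return values


# Hypothetical counts: 1000 IRTs falling off steeply with length.
counts = [400, 240, 144, 216]
print(irts_per_opportunity(counts))
```

With these counts the opportunities are 1000, 600, and 360 for the first three bands, so each of those bands has an IRTs/Op. of .4 even though the number of IRTs declines steeply. The open-ended last band necessarily gives 1.0, since every IRT that reaches it ends in it.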


A priori there seems to be little basis for preference between these two variables, number of IRTs and IRTs/Op. To choose between them we need to know whether the responding is influenced by the difference in the number of opportunities for the different IRTs, or whether the lengths of the IRTs are determined at the time of the initial response and are not influenced by subsequent opportunities during the IRTs. Figure 1A shows that during the first 4 days of 5-min. VI the number of short IRTs greatly exceeded the number of long IRTs. In contrast, Fig. 1C shows that during the same period the IRTs/Op. variable was nearly the same for different IRTs and hence indicates, as did Mueller’s data (3), that the probability of response was independent of the time since the last response. These results suggest that S’s initial responding is determined by an equal or nearly equal probability of response on each opportunity, irrespective of the time since the last response. If so, S is influenced by the difference in the number of opportunities for different IRTs, and the declining relative frequency curve probably results from the decline in opportunities with increasing IRT length. The independence of the preceding response disappeared by the last 10 days of 5-min. VI (Fig. 1D). By then the 0-4 sec. IRTs/Op. were significantly (P < .01, Wilcoxon’s T test, 5) greater than the IRTs/Op. in each of the other bands shown.⁶

⁵ It can be shown from the binomial distribution that the IRTs/Op. is an unbiased estimate of this probability. The pooled mean of several samples (total IRTs in the band divided by the total opportunities for those IRTs) is also unbiased. In this report all IRTs/Op. means are pooled except the means shown in Fig. 2. These are not pooled since the glaring inhomogeneity between Ss would have greatly distorted the pooled results as will be explained later. The means in Fig. 2 are averages of the IRTs/Op. values for each S. This statistic is biased, but these means are only given to demonstrate that Ss showed little consistency in the differences among the first four bands soon after conditioning.

⁶ When the IRTs/Op. are the same in different bands, the relative frequency curve is a waiting time distribution, a geometric distribution which is the discrete version of the exponential distribution (1, p. 218 ff.). However, the IRTs/Op. are only the same in different bands for a brief time, so a development in more general terms than the usual waiting times was necessary.

[Figure 1. Four panels. A and C: first 4 days of Treatment 1 (conditioning and 5-min. variable-interval reinforcement). B and D: last 10 days of Treatment 1. Ordinate of A and B: relative frequency. Abscissa: interresponse time (IRT) in sec.]

Fig. 1. A and B show the relative frequency of the various IRTs made by S;. Dots give the daily values where space allows. Solid lines connect the pooled means. C and D show the IRTs/Op. for the same data. Dots give the daily values; solid lines connect the pooled means; but both are only shown for bands with at least 20 opportunities each day. Above C the calculation of IRTs/Op. is illustrated with the data of A and C.

[Figure 2. Abscissa: daily bar sessions beginning with the day of conditioning.]

Fig. 2. The changes in the IRTs/Op. after conditioning. Each of the first five bands is represented by a different line. The lines only connect IRTs/Op. values based on at least 20 opportunities. The data from the two halves of the first session are plotted separately for S,, Se, S«. With S; the line for the 16-20 sec. IRTs/Op., which were quite close to the 12-16 sec. IRTs/Op., was omitted for better visibility of other lines.

Figures 2 and 3 show that all five Ss displayed this initial independence of the last response, and then loss of this independence with exposure to 5-min. VI. Each of the first five bands is represented by a different line showing the IRTs/Op. for that band on successive days. With S, and S, the IRTs/Op. in different bands separate widely and rapidly, but Sy, Ss, and S, show slower separation and not as great differences.
Separate analyses of variance using ranked data (2) for each S during the last 8 days of Fig. 2 showed that S1, S2, S3, and S4 had significant (P < .001) differences among bands. All four Ss had significant differences between the 0-4 and 8-12 sec. bands, and between the 4-8 and 12-16 sec. bands (P < .01, Wilcoxon’s T test, except that the latter comparison for S, only reached the .02 level). Unlike other Ss, S5 showed little difference between the IRTs/Op. in the 0-4, 4-8, and 8-12 sec. bands, even after 30 days of 5-min. VI. However, change to 2.5-min. VI rapidly produced a wide separation between these bands with the 0-4 sec. IRTs/Op. well above the others (Fig. 3). Although the sharp maximum at 0-4 sec. did not develop while on 5-min. VI, S5 did show a broad maximum at short IRTs then. Analysis of variance of ranked data (2) for the last 20 days of 5-min. VI showed significant (P < .001) differences among bands. The 0-4, 4-8, and 8-12 sec. IRTs/Op. were each significantly (P < .01, Wilcoxon’s T test) greater than both the 12-16 and 16-20 sec. IRTs/Op. Thus all five Ss start out with the IRTs/Op. in most bands close together, and by the end of the period shown all have the IRTs/Op. significantly higher at short IRTs, though Ss differed in the height and breadth of the peak.


The indiscriminate response found at 
the start of bar pressing is appropriate 
to the first contact with new stimuli, and 
seems to determine the form of the IRT 
distribution obtained then. Hence the 
steep decline in the relative frequency 
curves is probably due to the decrease in 


opportunities with increasing IRT 
length, and the IRTs/Op. variable 
appears more useful than the relative 


frequency because the IRTs/Op. cor- 
rects for the change in opportunities with 
IRT length. This choice is also sup-
ported by the fact that continued ex-
posure to VI results in a conspicuous
change in the IRTs/Op. curve, whereas 
the change in the relative frequency 


[Figure 3. Subject 5. Abscissa: daily bar sessions beginning with day of conditioning.]

Fig. 3. The changes in the IRTs/Op. of S5 after conditioning. The same as Fig. 2 but a different S.


curve is slight (Fig. 1). Thus the IRTs/ 
Op. curve displays better what S is doing, 
since it shows that at first the responding 
has an independence of the time since 
the last response which is appropriate to 
the novelty of the situation, and also 
shows that longer exposure to the sched- 
ule changes the behavior, and develops 
consistent differences in the responding 
at different times after the last response. 
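The argument that a flat IRTs/Op. curve by itself yields a steeply declining relative frequency curve can be illustrated numerically. The sketch below assumes a constant per-opportunity probability p in every band; the function name and the value p = .4 are chosen only for illustration.

```python
def relative_frequency(p, n_bands):
    """Share of all IRTs falling in each of the first n_bands bands when
    the probability of response on each opportunity is p in every band."""
    # An IRT lands in band k only if S passed k earlier opportunities
    # without responding, then responded on the next one.
    return [p * (1 - p) ** k for k in range(n_bands)]


freqs = relative_frequency(0.4, 5)
ratios = [later / earlier for earlier, later in zip(freqs, freqs[1:])]
print(freqs)   # declines from band to band
print(ratios)  # each band holds the same fraction (1 - p) of the band before
```

The decline comes entirely from the shrinking number of opportunities; the probability of response per opportunity never changes.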


Difficulties with the IRTs/Op. variable.—(a) 
With increasing IRT length, there is a decrease 
in the sample size (opportunities) used to com-
pute the IRTs/Op. Hence the reliability de-
creases with increasing IRT. This property
should be remembered during examination of
IRTs/Op. curves, the relation between IRTs/
Op. and IRT length. This is one reason for
limiting the IRTs/Op. values shown in figures 
to those bands where at least 20 opportunities 
per day occurred. 

(b) A change in the IRTs/Op. in one band 
changes the opportunities per hour in other 
bands. An IRTs/Op. increase in a band 
reduces the opportunities for longer IRTs and 
increases them for shorter. 

(c) The IRTs/Op. curve for an aggregate of two different kinds of responding tends to have a different shape than the IRTs/Op. curve of either responding alone. The term inhomogeneity will be used for such aggregates when it is highly improbable that the differences result from sampling variation. The major inhomogeneity observed was the combination of a high response rate in the first portion of the bar session with a much lower rate in the last portion. Such inhomogeneity will produce a peak in the IRTs/Op. curve at short IRTs even though both the high and low rates, by themselves, have flat IRTs/Op. curves. (The high response rate contributes far more short IRTs and opportunities for short IRTs than the low rate, so the composite curve at short IRTs is determined primarily by the high IRTs/Op. of the high rate. In a similar fashion the composite curve at long IRTs is determined primarily by the low IRTs/Op. of the low rate.) Obviously when we are concerned with the shape of the IRTs/Op. curve there should be no important inhomogeneity in the sample else a curve like neither component may result. Between days
21 and 26 of 5-min. VI there were days when the 
responding of S2 and Ss dropped enough during 
the last half of the bar session to seriously 
influence the IRTs/Op. curve. The cause of 
this decline is not clear, but it probably was not 





[Figure 4. Panel A includes a curve labeled "1/20 fixed or variable ratio (theoretical Reinfs./IRT)". Ordinate: probability of reinforcement of an IRT. Abscissa: interresponse time (IRT) in sec.]

Fig. 4. The relative reinforcement of IRTs with different lengths. A: The average Reinfs./IRT given by various schedules. B: The average Reinfs./Hr. given by 5-min. VI. The solid line results when the IRTs/Op. in all 4-sec.-wide bands is .4; the other curves result when the IRTs/Op. has the other values indicated.


due to feeding since Ss ate only 1 gm. during the 
2-hr. session. This decline was reduced to a 
negligible size by shortening the bar session to 
1 hr. The other Ss did not show this serious
inhomogeneity, nor did S; and Sy outside of the 
6 days mentioned. However, Ss did sometimes 
make a few long IRTs toward the end of the bar
session, or long IRTs associated with drinking,
defecation, etc. There were seldom over four
of these IRTs a day, so they had a negligible 
effect on IRTs/Op. values based on a moderate 
number of opportunities. To eliminate any 
important effect from these few long IRTs was 
another reason for plotting only IRTs/Op. 
values based on more than 20 opportunities per 
day. By using this minimum and by not using 
data from the 6 days of serious inhomogeneity 
with S, and Ss, serious distortion of the results 
by known inhomogeneity was prevented. 
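The effect of inhomogeneity described under (c) can be illustrated with expected counts. The sketch pools a high-rate and a low-rate portion of a session, each with a flat IRTs/Op. curve, and recomputes IRTs/Op. for the aggregate. The probabilities (.6 and .1), the IRT totals (900 and 100), and the function names are assumptions chosen for illustration, not values from the study.

```python
def expected_counts(p, n_irts, n_bands):
    """Expected IRTs per band for a constant per-opportunity probability p.
    A final open-ended band collects whatever remains."""
    counts = []
    remaining = float(n_irts)
    for _ in range(n_bands):
        in_band = remaining * p
        counts.append(in_band)
        remaining -= in_band
    counts.append(remaining)
    return counts


def irts_per_opportunity(counts):
    """IRTs in each band divided by the IRTs in that band plus all longer IRTs."""
    remaining = sum(counts)
    values = []
    for c in counts:
        values.append(c / remaining)
        remaining -= c
    return values


high = expected_counts(0.6, 900, 5)   # fast responding early in the session
low = expected_counts(0.1, 100, 5)    # slow responding late in the session
pooled = [h + l for h, l in zip(high, low)]

print(irts_per_opportunity(high)[:5])    # flat
print(irts_per_opportunity(low)[:5])     # flat
print(irts_per_opportunity(pooled)[:5])  # peaked at short IRTs
```

Each component is flat, yet the pooled curve starts near the high-rate value and falls toward the low-rate value, a shape like neither component.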


Relation between IRTs/Op. curve and 
reinforcement distribution.—Is the change
in the IRTs/Op. curve with exposure to 
VI related to the way VI allocates rein- 
forcements to different IRTs? Skinner 
has pointed out that FI favors the rein- 
forcement of long IRTs (4, p. 283 ff.). 
With FI, the assignment of a reinforce- 
ment is equally likely in any short time 
sample t sec. long, provided that the
choice of each sample is independent of
the position of the assigning tape. Then
the probability of the assignment of a
reinforcement is twice as great in a time
sample 2t sec. long, three times as great
in 3t sec., etc. Thus the probability of
an assignment bears a linear relation to 
the length of the sample time interval 
until the probability becomes one for 
sample intervals equal to or greater than 
the time between assignments. The 
IRTs may be considered time samples of 
this sort. If the IRT distribution pro- 
duced is independent of the FI assign- 
ment schedule, then the probability of 
reinforcement of an IRT will also be a 
linear function of the IRT length, since 
the reinforcement of an IRT depends 
only on whether a reinforcement is 
assigned during that IRT. Hence in 
large samples the number of reinforce- 
ments divided by the number of IRTs 
(Reinfs./IRT) will approximate the 
solid line in Fig. 4A. 
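
The linear relation described above can be checked by a short enumeration. The sketch below is illustrative only and is not part of the original analysis: it assumes assignments spaced exactly 300 sec. (5 min.) apart and whole-second starting phases, and the function name is hypothetical.

```python
def reinfs_per_irt_fi(irt_sec, interval_sec=300):
    """Fraction of equally likely start phases for which a time sample of
    length irt_sec contains a reinforcement assignment under FI.

    Assumption: assignments fall every interval_sec seconds and the start of
    the sample is independent of the tape position (whole-second phases).
    """
    hits = 0
    for phase in range(interval_sec):      # seconds since the last assignment
        # The next assignment comes at interval_sec; the sample covers
        # the span (phase, phase + irt_sec].
        if phase + irt_sec >= interval_sec:
            hits += 1
    return hits / interval_sec


for t in (4, 8, 12, 150, 300, 400):
    print(t, reinfs_per_irt_fi(t))
```

Under these assumptions the value doubles from a 4-sec. to an 8-sec. sample and reaches one for samples as long as the time between assignments, which is the shape of the solid line in Fig. 4A.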

With 5-min. FI no more than one 
reinforcement can be assigned during a 
< 5-min. IRT, but with the 5-min. VI 
used in this study there were reinforce- 
ment spacings as short as 6 sec. Con- 
sequently in VI the subtractive term for 
the probability of more than one assign- 
ment during an IRT is necessary in 
computing the relation between Reinfs./ 
IRT and IRT length (Reinfs./IRT 
curve). Hence for the 5-min. VI sched- 
ule used here the Reinfs./IRT curve 
(dashed line in Fig. 4A) lies slightly 
below the FI curve. In contrast, FR 
and VR give equal Reinfs./IRT at 
different IRTs (Fig. 4A). These Re- 
infs./IRT curves are determined by the 
schedule, and are not affected by S’s
responding as long as the responding is 
independent of the schedule (VI re- 
sponding has a minute, trivial effect on 
the Reinfs./IRT). 

Are the data consistent with the 
assumption that the IRT distribution is 
independent of the reinforcement schedule?
This stipulation requires that S
neither discriminate the reinforcement 
schedule, nor in any other way con- 
sistently respond differently in different 
portions of the schedule. To prevent 
discrimination of the reinforcement 
schedule, VI was used throughout. 
Cumulative response records, which fa- 
cilitate the detection of systematic 
changes, indicated that any correlation 
between responding and the schedule was 
of minor importance. There was a slight 
correlation that seems unavoidable. To 
prevent slower responding after a rein- 
forcement, it is necessary to have two 
reinforcement assignments close enough 
to reinforce two successive responses. 
Then the IRT that receives the second 
reinforcement tends to be about 8 sec. 
long, the time Ss spend eating the first 
pellet. Since less than 10% of the
reinforcements are involved, this effect 
was neglected for the present. 

The assumption of independence was 
also tested by comparison of the observed 
number of reinforcements in each band 
of each S with the number predicted from
the theoretical Reinfs./IRT curve (Fig. 
4A) and the observed number of IRTs. 
Data from the last 12 days of Treatment 
1 were used. To reduce an error due to 
grouping, 4-sec.-wide bands were used
while > 7 reinforcements were predicted,
but when < 7 were predicted, 8-sec.-wide
bands were used as long as > 7
reinforcements were predicted. The
variance for each band was obtained 
from the binomial formula. Only two 
differences between observed and pre- 
dicted values were significant at the .05 
level (normal curve approximation). 
Since there were 38 comparisons, two 
differences of that size would be expected 
from sampling variation. Hence the 
dashed line in Fig. 4A appears to be a 
satisfactory description of the allocation
of reinforcements to different IRTs
during 5-min. VI.
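
The band-by-band comparison just described can be sketched as follows. This is a minimal illustration, not the original computation: the counts are hypothetical, the function name is invented here, and a two-tailed .05 criterion (|z| > 1.96) is assumed because the text does not say whether the test was one- or two-tailed.

```python
import math


def band_z(observed, n_irts, p_reinf):
    """Normal-approximation z for observed vs. predicted reinforcements in one band."""
    predicted = n_irts * p_reinf                     # from the theoretical Reinfs./IRT
    variance = n_irts * p_reinf * (1.0 - p_reinf)    # binomial variance
    return (observed - predicted) / math.sqrt(variance)


# Hypothetical band: 900 IRTs, theoretical Reinfs./IRT of .02, 25 reinforcements observed.
z = band_z(observed=25, n_irts=900, p_reinf=0.02)
significant = abs(z) > 1.96           # assumed two-tailed .05 criterion
expected_by_chance = 38 * 0.05        # 38 comparisons at the .05 level
print(round(z, 2), significant, round(expected_by_chance, 1))
```

With 38 comparisons, about two differences would reach the .05 level by sampling variation alone, which is the basis for the conclusion drawn in the text.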

We have just seen that during 5-min. 
VI the Reinfs./IRT curve increases with 
increasing IRT length (Fig. 4A), but we 
found earlier that Ss exposed to this 
schedule gave an IRTs/Op. curve that 
decreased with increasing IRT length 
(Fig. 1D). Thus the differential rein- 
forcement of the Reinfs./IRT curve is in 
the opposite direction from the differ- 
ential response that develops, and 
scarcely seems responsible for the dif- 
ferential response. This suggests that 
some other measure of reinforcement 
may be influencing Ss. The other 
reasonable measure is the rate of rein- 
forcement of IRTs in a given band, the 
Reinfs./Hr. (The sum of all the band 
Reinfs./Hr. values will be called the 
Total Reinfs./Hr. to distinguish it.) In 
each band the Reinfs./IRT are fixed at 
the values given in Fig. 4A, but the 
Reinfs./Hr. are not fixed; they vary with 
the number of IRTs per hour (IRTs/ 
Hr.). (For each band: Reinfs./IRT 
X IRTs/Hr. = Reinfs./Hr.) Only the 
Total Reinfs./Hr. remains constant at 
12 during 5-min. VI. However, we can 
determine the approximate shape of the 
Reinfs./Hr. curve (relation between 
Reinfs./Hr. and IRT length) for the 
first days of 5-min. VI since the IRTs/ 
Op. were then about equal in different 
bands. A value of .4 for the IRTs/Op. 
in all bands is fairly representative of the 
different Ss, and determines a certain 
average IRTs/Hr. in each band. From 
this IRTs/Hr. curve and the Reinfs./ 
IRT curve of Fig. 4A, we can calculate 
the theoretical Reinfs./Hr. curve, the 
solid line in Fig. 4B, which Ss encountered
during the first days of 5-min. VI.⁷
The other lines show the relation when
the IRTs/Op. in all bands equals .3, .5,
or .6. The Reinfs./Hr. decrease rapidly
from about 4-8 sec. IRTs to longer IRTs;
consequently the Reinfs./Hr. curve may
be responsible for the observed changes in
the IRTs/Op. curve since the IRTs/Op.
increased where the Reinfs./Hr. were
high, and decreased where the Reinfs./
Hr. were low. As soon as a peak develops
in the IRTs/Op. curve at short
IRTs, the Reinfs./Hr. curve will no
longer follow the curves of Fig. 4B, but
the increase in short IRTs will further
accentuate the peak in the Reinfs./Hr.
curve at short IRTs. The IRTs/Op.
peak was usually found at 0-4 sec. IRTs
whereas Fig. 4B shows the Reinfs./Hr.
peak tends to be at 4-8 sec. IRTs.
Probably Ss reacted to the major trend
and missed the dip at the shortest IRTs.
Sampling variation in the Reinfs./Hr.
curve will tend to obscure the dip. If
the rat does not react to the dip at first
and produces an IRTs/Op. curve with a
negative slope throughout, that will tend
to produce a Reinfs./Hr. curve with a 0-4
sec. peak, and so the 0-4 sec. IRTs/Op.
peak may be perpetuated. On the other
hand, Ss apparently did react to the
lower Reinfs./Hr. for 0-4 sec. IRTs than
for 4-8, since Ss for a long period had a
lower IRTs/Op. at 0-4 sec. than at 4-8
sec. (Fig. 3). However, the 0-4 sec.
peak quickly developed when the rate of
reinforcement in all bands was doubled.
The increase in reinforcement would be
expected to raise the IRTs/Op. in all
bands, which, as Fig. 4B shows, steepens
the major negative slope of the Reinfs./
Hr. curve, and reduces the dip at 0-4
sec. (When the IRTs/Op. exceed .7,
the Reinfs./Hr. peak lies in the 0-4
sec. band.)

⁷ The following calculation of the Reinfs./Hr.
of 0-4 sec. IRTs when the IRTs/Op. in all
bands are .4 will illustrate these computations.
Consider a sample of 1,000 IRTs. If the IRTs/
Op. in all bands are .4 there will be on the average
1,000 × .4 = 400 0-4 sec. IRTs; (1,000 − 400)
× .4 = 240 4-8 sec. IRTs; (1,000 − 400 − 240)
× .4 = 144 8-12 sec. IRTs; etc. Then the
average time required for the IRTs in each band
can be calculated: 400 × 2 sec. = 800 sec. required
for 0-4 sec. IRTs, etc. (All 0-4 sec.
IRTs are considered 2 sec. long. This adds a
small grouping error, but one that can be reduced
to any desired size by employing narrower
bands.) The average time required for 1,000
IRTs is obtained by summing these times for
each band until the remaining bands add a
negligible amount. In this case the total time
for 1,000 IRTs is 2.22 hr., so the IRTs/Hr. in
the 0-4 sec. band are 400/2.22 or 180. The
0-4 sec. IRTs received .00667 Reinfs./IRT
(another small grouping error) so we have
180 × .00667 = 1.20 Reinfs./Hr. for 0-4 sec.
IRTs.
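
The footnote computation can be retraced numerically. The sketch below is an illustration under stated assumptions rather than the original procedure: every IRT in a band is placed at the band midpoint, and Reinfs./IRT is approximated by midpoint / 300 sec. (capped at one), which ignores the small subtractive VI correction. The function and parameter names are hypothetical.

```python
def reinfs_per_hr_curve(irts_per_op, band_width=4.0, vi_mean_sec=300.0,
                        n_irts=1000.0, n_bands=200):
    """Theoretical Reinfs./Hr. per band when IRTs/Op. is the same in every band.

    Assumptions: IRTs sit at band midpoints; Reinfs./IRT = midpoint / vi_mean_sec,
    capped at 1 (the VI correction for multiple assignments is ignored).
    """
    counts, seconds = [], []
    remaining = n_irts
    for k in range(n_bands):
        midpoint = band_width * k + band_width / 2.0
        count = remaining * irts_per_op      # IRTs ending in this band
        counts.append(count)
        seconds.append(count * midpoint)     # time spent in these IRTs
        remaining -= count
    total_hr = sum(seconds) / 3600.0
    curve = []
    for k, count in enumerate(counts):
        midpoint = band_width * k + band_width / 2.0
        reinfs_per_irt = min(midpoint / vi_mean_sec, 1.0)
        curve.append((count / total_hr) * reinfs_per_irt)
    return counts, total_hr, curve


counts, total_hr, curve = reinfs_per_hr_curve(0.4)
print([round(c) for c in counts[:3]])   # IRTs in the 0-4, 4-8, 8-12 sec. bands
print(round(total_hr, 2))               # hours needed for 1,000 IRTs
print(round(curve[0], 2))               # Reinfs./Hr. for 0-4 sec. IRTs
```

Under these simplifications the band values sum to the Total Reinfs./Hr. of 12, and raising the common IRTs/Op. to .8 moves the largest band value to 0-4 sec.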

The agreement of the IRTs/Op. 
curves with the Reinfs./Hr. curves and 
the lack of agreement with the Reinfs./ 
IRT curves indicates that the relative 
Reinfs./Hr., not the relative Reinfs./ 
IRT, determine the shape of the IRTs/ 
Op. curve. Since the relative Reinfs./ 
Hr. curve is also the curve of relative 
frequency of reinforcements during a 
common time interval, the agreement 
between IRTs/Op. and Reinfs./Hr. 
curves essentially indicates that the 
IRTs/Op. curve is controlled primarily 
by the difference in reinforcements at 
long and short IRTs, and is compara- 
tively little influenced by the difference 
in unreinforced long and short IRTs. 
Thus the results imply that the differ- 
ential response observed here is not due 
to the action of reactive inhibition. The 
results do not indicate that the unrein- 
forced responses have no effect, just that 
here the far greater number of unreinforced
short IRTs has little influence on
S’s relative IRTs/Op. at short and long
IRTs. The lack of reinforcement for
many responses may influence the be- 
havior in other respects, e.g., it may 
contribute to the decline in the last half 
of 2-hr. bar sessions. 

Possibly the change in the IRTs/Op. 
curve to a shape which roughly matches 
the Reinfs./Hr. curve is not due to the 
Reinfs./Hr. curve, and the match is 
fortuitous. The peak in the IRTs/Op. 
curve at short IRTs might be due to 
some other factor, e.g., an inherent differ- 
ence in the reinforcement requirements 
of short and long IRTs. This possi- 
bility can be eliminated by showing that 
modification of the reinforcement curves 
by E results in corresponding changes in 
the IRTs/Op. curves. Evidence that 
some reinforcement curve determined 
by the schedule affects the IRTs/Op. 
curve comes from Skinner’s finding (4) 
that FR produces higher response rates 
than comparable FI. Compared with 
FI, FR provides more Reinfs./IRT for 
short IRTs (Fig. 4A), and so for similar
responding, the Reinfs./Hr. curves would
have a much more pronounced peak at 
short IRTs. 

If the Reinfs./Hr. curve influences the 
IRTs/Op. curve, then a circular relation 
exists, since we saw earlier that the 
IRTs/Op. curve influences the Reinfs./ 
Hr. curve. Hence stability would be 
expected only after a long series of 
changes: the initial IRTs/Op. curve 
determines a Reinfs./Hr. curve, but in 
general this Reinfs./Hr. curve changes 
the IRTs/Op. curve, and the change in 
the IRTs/Op. curve changes the Reinfs./ 
Hr. curve, etc. Relative stability would 
result when the IRTs/Op. curve gen- 
erates a Reinfs./Hr. curve that produces 
the same IRTs/Op. curve. With this 
circular relation we would expect that 
behavior would require a long time to 
stabilize even approximately, and might 
always show considerable variability and 
drift due to chance fluctuation in the 
reinforcement curves. The results (Fig. 
2 and 3) seem to agree with these pre- 
dictions. The circular relation would 
not occur if the Reinfs./IRT curve was 
in control, since the Reinfs./IRT curve 
is fixed and independent of responding 
in the long run. 
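
The half of this circular relation that is arranged by the schedule can be sketched numerically; the half that depends on the rat is not modeled. The example is hypothetical throughout: the later IRTs/Op. curve is invented for illustration, the last value supplied is reused for all longer bands, and Reinfs./IRT is approximated as band midpoint / 300 sec., capped at one.

```python
def schedule_half(irts_per_op, band_width=4.0, vi_mean_sec=300.0, n_bands=300):
    """Reinfs./Hr. per band generated by a given IRTs/Op. curve.

    irts_per_op lists the conditional response probability for successive
    bands; the last value is reused for all longer bands (an assumption).
    """
    remaining, shares = 1.0, []
    for k in range(n_bands):
        p = irts_per_op[min(k, len(irts_per_op) - 1)]
        shares.append(remaining * p)         # proportion of IRTs in band k
        remaining *= (1.0 - p)
    mids = [band_width * (k + 0.5) for k in range(n_bands)]
    hr_per_irt = sum(s * m for s, m in zip(shares, mids)) / 3600.0
    return [s / hr_per_irt * min(m / vi_mean_sec, 1.0)
            for s, m in zip(shares, mids)]


flat = schedule_half([0.4])                    # equal IRTs/Op. in all bands
peaked = schedule_half([0.7, 0.5, 0.4, 0.3])   # hypothetical curve peaked at short IRTs
print(flat[0] < flat[1], peaked.index(max(peaked)))
```

In this sketch the flat curve yields fewer Reinfs./Hr. at 0-4 sec. than at 4-8 sec., whereas the curve peaked at short IRTs places the largest Reinfs./Hr. value in the 0-4 sec. band.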

This circular relation complicates the 
effect on responding of differences or 
changes in the VI rate (Total Reinfs./ 
Hr.). With fewer total Reinfs./Hr., the 
Reinfs./Hr. in each band will be fewer, 
the IRTs/Op. in each band will be lower, 
and so the peak in the Reinfs./Hr. curve 
will be less pronounced and at longer 
IRTs (Fig. 4B). Hence the peak in the 
IRTs/Op. curve should be less pro- 
nounced, be located at longer IRTs, and 
should develop slower (at some point it 
may not develop at all). Thus the 
differences in responding seen with 
different reinforcement rates are proba- 
bly partly due to differences in the 
Reinfs./Hr. curves. The data of Fig. 3 
support this view. 

The circular relation will tend to 
amplify the influence of other variables 
also. Apparently increased food depri- 
vation increases the VI response rate, 
probably by increasing the IRTs/Op. in 
all bands. Such an increase will ac- 
centuate the Reinfs./Hr. peak and move
it to shorter IRTs, which will further 
increase the response rate. 

The above presentation has been 
simplified by use of the theoretical 
Reinfs./Hr. curves. The actual Reinfs./ 
Hr. curves received by Ss were affected 
by appreciable sampling fluctuations
which probably contributed to the
variability of the response curves. However,
the observed Reinfs./Hr. curves
clearly showed the same major trend as
the theoretical curves, the decrease in 
Reinfs./Hr. with increase in IRT. Ob- 
served Reinfs./Hr. curves for the last 10 
days of 5-min. VI are shown in the top 
row of Fig. 5. The IRTs/Op. and the 
observed Reinfs./Hr. curves match well, 
but, of course, this match is partly due 
to the dependence of the Reinfs./Hr. 
curve on the IRTs/Op. curve arranged 
by the schedule, and partly due to the 
reciprocal dependence of the IRTs/Op. 
curve on the Reinfs./Hr. curve which the 
rat is responsible for. It is in the 
changes after conditioning that the roles 
played by each half of this circular rela- 
tion can be distinguished. Immediately 
after conditioning the Reinfs./Hr. curve 
depends on the IRTs/Op. curve, but the 
two curves are different, apparently 
because S has not yet reacted to the
Reinfs./Hr. curve. Soon the IRTs/Op.
curve changes until it is in rough agreement
with the Reinfs./Hr. curve. This
change in responding can be attributed
to the reaction of S to the Reinfs./Hr.
curve, and cannot be attributed to the
reciprocal dependence of the Reinfs./Hr.
curve on the IRTs/Op. curve arranged
by the schedule. 


Treatment 2: Reinforcement of Only 
> 40-Sec. IRTs on a 5-Min. VI 
Schedule (48 Days) 

The effect of this radical change in
reinforcement is shown in the top two
rows of Fig. 5. To show the IRTs out
to 56 sec. in this figure, 8-sec.-wide
bands were necessary instead of the
4-sec.-wide ones used above. The
light dotted Reinfs./Hr. curves show
the change in treatment: the elimination
of reinforcements for < 40-sec.
IRTs, and a Reinfs./Hr. increase for
> 40-sec. IRTs. Before the change
the IRTs/Op. curves (solid lines) of
all Ss were higher at short IRTs than
at long, but after the change the
curves were higher at long IRTs.
Analysis of variance using ranked
data showed that during the last 10
days of both Treatments 1 and 2,
there were significant (P < .001)
differences among bands for each S.
During the last 10 days of Treatment
1 with each S the 0-8 sec. IRTs/
Op. were significantly (P < .01, Wilcoxon’s
T test) higher than the IRTs/
Op. in all other bands out to 48 sec.
During the last 10 days of Treatment
2 with each S the 40-48 sec.
IRTs/Op. were significantly (P
< .01, Wilcoxon’s T test) higher than
the 0-8, 8-16, and 16-24 sec. IRTs/
Op. The 0-8, 8-16, and 16-24 sec.
IRTs/Op. of each S were significantly
(P < .01, Wilcoxon’s T test) less during
Treatment 2 than 1. The effect
of the change on > 40-sec. IRTs/Op.
was not clear due to the scarcity of
opportunities for long IRTs during
Treatment 1.

[Figure 5. Rows of panels: last 10 days of Treatment 1 (5-min. variable-interval reinforcement); last 10 days of Treatment 2 (reinforcement of only > 40-sec. IRTs on a 5-min. variable-interval schedule); last 10 days of Treatment 3 (reinforcement of only > 40-sec. IRTs on a 2.5-min. variable-interval schedule); Treatment 4 (reinforcement same as Treatment 1). Axes: interresponse time (IRT) in sec.; reinforcements/hr.]

Fig. 5. The effect of the different treatments (rows) on the IRTs/Op. curves of the four Ss
(columns). Solid lines connect the pooled means for the 10-day periods, but are only given where
there were at least 70 opportunities during the 10 days. Dots show the daily values and are only
given for bands with at least 20 opportunities each day. Light dotted lines show the mean Reinfs./Hr.


During the two 10-day periods just 
considered the same Total Reinfs./Hr. 
were given, but a considerable change 
was made in the allocation of reinforcements
to different IRTs. A major
change in the responding resulted, which 
maintained the IRTs/Op. peak in the 
region of maximum Reinfs./Hr. This 
evidence that sometimes Ss adjust their 
IRTs/Op. curve to the reinforcement 
curve supports the view that such adjust- 
ment occurs during 5-min. VI. How- 
ever, the reinforcement distribution of 
Treatment 2 is so different from that of 
5-min. VI that the adjustment produced 
by Treatment 2 does not show con- 
clusively that such adjustment is going 
on during 5-min. VI. To demonstrate 
adjustment during 5-min. VI it seems 
important to show that small experimentally
induced changes in the reinforcement
curves of 5-min. VI result in
small corresponding changes in the 
IRTs/Op. curves. 

Treatment 2 provided no new evidence 
as to which reinforcement curve controls 
the IRTs/Op. curve, since during Treat- 
ment 2 both the Reinfs./Hr. and Reinfs./ 
IRT were zero out to 40 sec. and high 
beyond 40 sec. 

Although Treatment 2 changed the 
IRTs/Op. curves greatly, the adjust- 
ment stopped well short of the maximum 
adjustment possible as if checked by 
other factors. Generalization may be 
responsible for the high 32-40 sec. IRTs/Op.,
but the 0-8 sec. IRTs/Op. seem a
different matter since they exceed the 
8-12 sec. IRTs/Op. for all Ss (Fig. 5). 
The 0-8 sec. IRTs/Op. of 8; and 8, were 
stable and especially high at the end of 
Treatment 2. 

Of course, the number of IRTs in 
various bands also changed radically 
during Treatment 2. Each S showed 
significantly less 0-8 and 8-16 sec. IRTs, 
and more 32-40 and 40-48 sec. IRTs 
(P < .01 except the 8-16 sec. IRTs of 8, 
where P < .02, Wilcoxon's T test). But 
the curves of the number of IRTs were 
not as orderly as the IRTs/Op. curves. 
All Ss showed a peak at long IRTs, and 
with S; the peak fell in the 40-48 sec. 
band, but with §,, S:, and S, the peak 
fell in the 24-32 or 32-40 sec. bands. In 
addition S;, Ss, and S, still showed a 
peak at 0-8 sec. IRTs which exceeded 
the peak beyond 40 sec., although the 
daily 0-8 sec. IRTs of each S numbered 
less than 1/12 of their Treatment-1
values. The IRTs/Op. curves, on the 
other hand, had their maxima in the 
reinforced bands with all Ss. The more 
orderly picture presented by the IRTs/ 
Op. curves again suggests that they show 
the response of Ss to the reinforcing 
conditions better than curves of the 
number of IRTs which seem to be greatly 
affected by the decrease in opportunities 
with increase in IRT length. This 
decline in opportunities magnifies the 
0-8 sec. IRTs and counteracts the rise 
in IRTs/Op. between 32 and 48 sec., 
thereby placing the peak short of 40 sec. 
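
The difference between the two kinds of curve can be illustrated with a small calculation. The counts below are hypothetical, and the sketch assumes that an opportunity for a band is any IRT that lasted at least to the start of that band, so that the opportunities for a band equal the number of IRTs in that band or any longer band (the same arithmetic as in footnote 7).

```python
def irts_per_op(counts):
    """IRTs/Op. for each band from the number of IRTs per band.

    Assumption: opportunities for a band = IRTs in that band or any longer band.
    The final, open-ended band is omitted because its ratio is always 1.
    """
    result = []
    opportunities = sum(counts)
    for count in counts[:-1]:
        result.append(count / opportunities)
        opportunities -= count
    return result


# Hypothetical daily counts for 8-sec.-wide bands 0-8, 8-16, ..., 48-56, > 56 sec.
counts = [30, 10, 8, 10, 18, 25, 6, 9]
ratios = irts_per_op(counts)
print(counts.index(max(counts)))   # index of the band with the most IRTs
print(ratios.index(max(ratios)))   # index of the band with the highest IRTs/Op.
```

With these invented counts the number of IRTs is greatest in the 0-8 sec. band, while the IRTs/Op. are greatest in the 40-48 sec. band, because few opportunities remain at long IRTs.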

The 38 days which elapsed between
the two 10-day periods just compared
were necessary for the responding to
stabilize. Within the first 10 days of
Treatment 2, an abrupt drop in the short
IRTs/Op. and a rise in the long IRTs/
Op. occurred. Thereafter all Ss had the
40-48 sec. IRTs/Op. average above all
the shorter IRTs/Op. although the difference
was small at first. In Fig. 6, which
shows the daily response rates, this initial
abrupt change and later slow change is
somewhat visible. Figure 6 also shows
the large size of the rate changes produced
by Treatment 2.

Fig. 6. Daily response rates during the last
12 days of Treatment 1 and Treatments 2, 3,
and 4.

The response rates of S,; and S; on 
5-min. VI were low enough so there were 
more than twelve > 40-sec. IRTs/Hr., 
hence the Total Reinfs./Hr. could be 
kept at 12 right after the change to 
Treatment 2. The higher response rates 
of S; and S, on 5-min. VI resulted in 
only about eight > 40-sec. IRTs/Hr. 
Thus at the start of Treatment 2, 8; and 
S, could only be given 8 Total Reinfs./ 
Hr., since the frequency of > 40-sec. 
IRTs/Hr. did not rise at once. But 
long IRTs did rise, so 12 Total Reinfs./ 
Hr. could be given S, after 8 days, and 
§, after 22 days of Treatment 2. This 
period of slight reduction in the Total 
Reinfs./Hr. of two Ss is not thought to 
have made any important difference in 
the results since the reduction occurred 
well before the final 10 days, and because 
the changes in responding continued 
after the Total Reinfs./Hr. had been 
restored to 12.
Treatment 3: Reinforcement of Only
> 40-Sec. IRTs on a 2.5-Min. VI
Schedule (31 Days) 


The Reinfs./Hr. of > 40-sec. IRTs 
were doubled to determine the relative 
effect on > 40 and < 40-sec. IRTs. 
The effect on both was slight as the 
second and third rows of Fig. 5 show. 
There were some increases in the 
40-48 sec. IRTs/Op. (S; and S, 
significant, P < .01, Wilcoxon’s T
test), and decreases in < 40-sec. 
IRTs/Op. (perhaps due to the 31 
days more of extinction), so the 
IRTs/Op. of all Ss rose somewhat 
more abruptly at 40 sec. The further 
extinction did not lower the high 0-8 
sec. IRTs/Op. of S,; and S,. 


The results from Treatments 2 and 3 
show that rats can distinguish 40-sec. 
intervals from shorter ones with fair 
accuracy. During Treatment 3 there 
was only an 8-sec. difference between the 
24-32 sec. IRTs, where the IRTs/Op. 
were nearly minimal with most Ss, and 
the 40-48 sec. IRTs where the IRTs/ 
Op. were nearly maximal. A careful 
check revealed no changes in S’s environ- 
ment correlated with the time since the 
last bar press. How then did the rats 
accomplish this timing? One possibility 
is that after a bar press a sequence or 
chain of responses other than bar pressing 
occurred and led to another bar press 
after a certain time interval. Sequences 
that spaced presses more than 40 sec. 
apart would be reinforced, and hence 
might increase in frequency compared 
with the unreinforced sequences that 
did not separate presses by 40 sec. 
Wilson and Keller (6) reported recog- 
nizable and predictable sequences of 
responses other than bar pressing be- 
tween bar presses. They first reinforced 
only > 10-sec. IRTs, and gradually 
shifted to > 30-sec. IRTs. In the 
experiment reported here such recog- 
nizable and predictable behavior was not 
found. Most Ss were active between 
bar presses, but the activity was highly 
variable and unpredictable. During
some 40-44 sec. IRTs S would groom, 
chew on the cage bars, and scratch; 
during other 40-44 sec. IRTs he would 
groom and smell in the food chamber; 
during others he would sniff about the 
cage, etc. Many different sequences oc- 
curred during IRTs of the same length, 
and the only visible difference between 
the behavior during 40-44 sec. IRTs and 
that during other IRTs (except very 
short IRTs) was the fact to be explained, 
that the bar press came at a different time.
Furthermore, S; sat quietly, with eyes 
open, between almost all presses. About 
the only movements S; made were those 
of bar pressing, obtaining food pellets, 
and eating them. Other Ss showed 
similar inactivity during some > 40- 
sec. IRTs. 

Perhaps these results differ from those 
of Wilson and Keller because they began 
with 100% reinforcement of > 10-sec. 
IRTs and gradually changed to 100% 
reinforcement of > 30-sec. IRTs. Short
sequences of behavior established during
reinforcement of > 10-sec. IRTs may
have been gradually extended. In the
present experiment, Ss were suddenly
changed from reinforcement of mostly 
short IRTs to reinforcement of > 40-sec. 
IRTs, a longer delay than tried by 
Wilson and Keller. In addition Ss did 
not receive 100% reinforcement of > 40- 
sec. IRTs owing to the upper limit on 
Total Reinfs./Hr. (usually S, and S; had 
below 50%, while S; and S, had 50-
100%). Whatever the source of the 
difference though, the present experi- 
ment indicates that rats can time 40-sec. 
intervals fairly accurately by some means 
other than a sequence of overt responses 
during the interval. Apparently rats 
have available some internal variable 
that changes with the time since the last 
response. This variable may function 
like an external stimulus in that a differ- 
ence in the reinforcement of responses at 
different values of this variable results 
in a higher probability of response at the 
values reinforced more. 

The behavior between bar presses may 
affect the IRTs/Op. curve even though 
it does not account for the timing.
When only > 40-sec. IRTs are reinforced,
behavior other than bar pressing
always precedes reinforced bar presses 
and will be reinforced somewhat. The 
interference or competition of this be- 
havior with bar pressing may lower the 
IRTs/Op., even though this behavior is 
quite variable and does not account for 
the accuracy of the timing. The in- 
activity of S; and the inactivity of all Ss 
during some > 40-sec. IRTs, shows that 
either interfering behavior is not essential 
or that even inactivity can interfere with 
bar pressing. 


Treatment 4: Return to 5-Min. VI
with No Other Restrictions on 
IRTs Reinforced (30 Days) 

Return to the conditions of Treatment
1 eliminated the peak at > 40-sec.
IRTs, and resulted in either the
decreasing IRTs/Op. curve characteristic
of Treatment 1 or a flat curve
(Fig. 5). Again a change in the
Reinfs./Hr. curve produced a radical
adjustment of the IRTs/Op. curve
that brought it into rough agreement
with the Reinfs./Hr. curve. The
IRTs/Op. curves changed a large
part of the way back to the curves
of Treatment 1, but with all Ss, the
0-8 sec. IRTs/Op. were significantly
(P < .01, Wilcoxon’s T test) less
during Treatment 4 than during
Treatment 1. Several bands were
significantly less for some Ss. By
the end of Treatment 4 the behavior
was relatively stable (Fig. 6), and
any further change would have been
slow. It is possible that these differences
might have occurred without
the intervening extinction of < 40-sec.
IRTs, since data are not available
on the change in 5-min. VI responding
over such long time intervals. However,
the differences were large for
S,, Se, and S, (Fig. 6). Hence it
seems likely that more than the usual
variability of behavior during 5-min.
VI is involved.
Probably the extinction of < 40-sec. 
IRTs or the high reinforcement of > 40-sec.
IRTs had some effect which was not
reversed by Treatment 4. The circular 
relation of VI provides a possible ex- 
planation. We have seen that the 
Reinfs./Hr. curve probably affects the 
IRTs/Op. curve via the rats, and the 
IRTs/Op. curve affects the Reinfs./Hr. 
curve via the schedule. A succession of 
changes may finally produce a pair of 
Reinfs./Hr. and IRTs/Op. curves that 
tend to maintain each other (a semistable
pair). But there may be many such 
semistable pairs. Beginning with low 
values of IRTs/Op. at short IRTs as in 
Treatment 4, the semistable pair that 
occurs first may be different than in 
Treatment 1 where the IRTs/Op. began 
at higher values. Under some condi- 
tions there may not be any semistable 
pair. Fluctuation between different be- 
haviors may result then. 


Treatment 5: Continuation of Treatment 4,
but with Four 1-Hr. Bar
Sessions Daily (24 Days) 


This program was used in order to 
stabilize behavior faster after treat- 
ment changes. For 9 days the four 
1-hr. bar sessions were 3 hr. apart
with the last ending 3 hr. before the 
feeding period. With this program 
all Ss responded significantly (P
< .01, Wilcoxon’s T test) less during
the first session of the day than during 
two or more other sessions. To 
reduce this inhomogeneity, during the 
remaining 15 days the four 1-hr. bar
sessions were spaced 2 hr. apart with 
the last ending 14 hr. before the 
feeding period. With this program 
the inhomogeneity was small and 
tolerable, although two Ss had re- 
sponses in the fourth session slightly 
but significantly (P < .01, Wilcoxon’s 
T test) higher than the others. 


Treatment 6: Reinforcement of Only
< 28-Sec. IRTs on a 5-Min. VI
Schedule (13 Days)


This treatment was given to determine
whether reinforcement of only
short IRTs raises response rates as 
effectively as elimination of reinforce- 
ment for short IRTs lowers rates. 
The change in treatment did increase 
(P < .01, Wilcoxon’s T test) the 0-4 
sec. IRTs/Op. of all Ss, but the IRTs/Op.
beyond 4 sec. either remained low
or dropped slightly. Hence the time 
spent daily in long IRTs remained 
high or increased, and there was little 
or no increase in the response rate. 
One reason why raising the response 
rate is more difficult apparently lies in 
the differential reinforcement of se- 
quences of IRTs. During Treatment 
6, < 16-sec. IRTs were more probable 
following > 16-sec. IRTs than following
< 16-sec. IRTs. The differences
were large for S, and S; and small for 
S, and S,, but all were significant at 
the .01 level by a chi-square test,
except S; only reached the .05 level. 
During Treatment 5, all Ss had shown 
slight but not significant differences 
in the opposite direction, i.e., < 16-sec.
IRTs were more probable following
< 16-sec. IRTs than following
> 16-sec. IRTs. 
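
A chi-square test of this kind of sequential dependence can be sketched for one S as follows. The counts are hypothetical, the function name is invented for illustration, and no continuity correction is applied, since the text does not say whether one was used.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts of successive IRT pairs for one S.
# Rows: preceding IRT > 16 sec., preceding IRT < 16 sec.
# Columns: next IRT < 16 sec., next IRT > 16 sec.
after_long = (120, 80)
after_short = (90, 110)
chi2 = chi_square_2x2(*after_long, *after_short)
print(round(chi2, 2), chi2 > 6.635)   # 6.635 is the .01 critical value for 1 df
```

In this invented table short IRTs are more frequent after long IRTs than after short ones, and the difference exceeds the .01 criterion.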


These observations correspond to the 
fact that during reinforcement of < 28- 
sec. IRTs on a VI schedule the rein- 
forcement of < 16-sec. IRTs is greater 
following long IRTs than following 
short. This can be seen from the following:
(a) During VI the probability
of assignment of a reinforcement during 
an IRT is greater the longer the IRT 
until the probability reaches one (Fig. 
4A). (b) But when only < 28-sec.
IRTs are reinforced, then reinforcements 
assigned during > 28-sec. IRTs are not 
delivered until the next < 28-sec. IRT 
occurs. (c) Hence a < 16-sec. IRT 
following a long IRT receives the 
Reinfs./IRT of an IRT as long as the 
sum of the long and short IRT. This is 
far greater than the Reinfs./IRT of a 
< 16-sec. IRT following another < 16- 
sec. IRT which is just the low Reinfs./ 
IRT of < 16-sec. IRTs. (d) When this
treatment is applied to the low response
rate of S; and S,; where the number of 
long-short IRT sequences is not very 
different from the short-short sequences, 
then the Reinfs./Hr. variable will also 
strongly favor long-short sequences. 
When applied to the higher response 
rates of S; and S, which have more short- 
short sequences than long-short, then the 
Reinfs./Hr. advantage of long-short 
sequences is less, but still present. 
Consequently during reinforcement of 
only short IRTs on a VI schedule, Ss 
made two or three responses in succession 
where before they had made one. The 
short IRTs increased, but the long IRTs 
were little affected. Apparently re- 
sponding is affected not only by the 
length of the IRT preceding reinforce- 
ment, but also by the length of the 
“second-back” IRT, at least when the 
immediately preceding IRT is short. 
To raise the frequency of short IRTs 
when many long IRTs are present, it 
seems necessary to reinforce sequences 
of short IRTs as often or more often than 
long-short sequences. 
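
The advantage of long-short sequences in steps (a) to (c) can be put in numbers. The sketch below is a simplified illustration, not the original analysis: it uses the linear approximation to Reinfs./IRT for a 5-min. schedule (ignoring the small VI correction), assumes that no reinforcement was already being held over before the preceding IRT began, and uses hypothetical function names and IRT lengths.

```python
def reinfs_per_irt(sample_sec, vi_mean_sec=300.0):
    """Approximate probability that a reinforcement is assigned during a time
    sample (linear approximation, capped at 1)."""
    return min(sample_sec / vi_mean_sec, 1.0)


def short_irt_reinf(short_sec, preceding_sec, limit_sec=28.0):
    """Reinfs./IRT for a short IRT when only IRTs under limit_sec are reinforced.

    If the preceding IRT was too long to be reinforced, a reinforcement assigned
    during it is held over, so the short IRT collects the assignment
    probability of both intervals together.
    """
    if preceding_sec >= limit_sec:
        return reinfs_per_irt(preceding_sec + short_sec)
    return reinfs_per_irt(short_sec)


after_long = short_irt_reinf(8, preceding_sec=60)   # 8-sec. IRT after a 60-sec. IRT
after_short = short_irt_reinf(8, preceding_sec=8)   # 8-sec. IRT after an 8-sec. IRT
print(round(after_long, 3), round(after_short, 3))
```

Under these assumptions the short IRT that follows a long IRT is several times more likely to be reinforced than one that follows another short IRT.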


SUMMARY 


This study demonstrates that under some 
conditions rats adjust the time intervals between 
bar presses, the interresponse times, according 
to the relative frequency with which reinforce- 
ments have followed the various interresponse 


times. The results shed light on the nature of 
this adjustment, and indicate that it occurs 
during some interval schedules. 

The interresponse times made by four rats
were measured during 5-min. variable-interval
reinforcement. Short interresponse times were
found to greatly outnumber long ones, probably
because there are more opportunities for short
interresponse times. This difference in opportunities
can be corrected for by calculation of the
Interresponse Times/Opportunity. This variable
is an estimate of the probability of response
at different times after the last response, given
an opportunity. Evidence is presented that for
some purposes this variable is a more useful
measure of the responding in this situation than
is the number of interresponse times.
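
The correction for opportunities described above can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: it assumes that the opportunities for a class interval are all interresponse times at least as long as the lower bound of that interval, and the function name, the 4-sec. class width, and the data are hypothetical.

```python
def irts_per_opportunity(irts, bin_width, n_bins):
    """Return (counts, opportunities, ratios) for successive class intervals.

    Assumption: an IRT gives an "opportunity" to every class interval
    whose lower bound it reaches, so the opportunities for a class are
    the IRTs falling in that class or in any longer one.
    """
    counts = [0] * n_bins
    for t in irts:
        index = int(t // bin_width)
        if index < n_bins:  # IRTs beyond the last class count only as opportunities
            counts[index] += 1

    opportunities, ratios = [], []
    remaining = len(irts)  # every IRT is an opportunity for the first class
    for count in counts:
        opportunities.append(remaining)
        ratios.append(count / remaining if remaining else None)
        remaining -= count
    return counts, opportunities, ratios


# Hypothetical IRTs in seconds: short IRTs greatly outnumber long ones.
irts = [0.5, 1, 1.5, 2, 2.5, 3, 3.5, 3.9,  # 8 IRTs in the 0-4 sec. class
        4.5, 5, 6, 7,                      # 4 IRTs in the 4-8 sec. class
        9, 11,                             # 2 IRTs in the 8-12 sec. class
        13,                                # 1 IRT in the 12-16 sec. class
        20]                                # 1 IRT longer than every class
counts, opportunities, ratios = irts_per_opportunity(irts, bin_width=4, n_bins=4)
print(counts)         # [8, 4, 2, 1]
print(opportunities)  # [16, 8, 4, 2]
print(ratios)         # [0.5, 0.5, 0.5, 0.5]
```

In this made-up case the number of IRTs halves from class to class, yet the IRTs/Opportunity is the same in every class, which is the sense in which the raw counts can overstate the tendency to respond soon after the last response.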

At first after conditioning the probability of
response was about equal at different times after
the last response, but continued exposure to the
variable-interval schedule produced a higher
probability soon after the last response and a
lower probability at longer times after the last
response. Analysis of the reinforcements given
different interresponse times by the schedule
showed that the Reinforcements/Interresponse
Time are greatest for long interresponse times,
but that the Reinforcements/Hr. are greatest
for short interresponse times. The agreement
between the greater Reinforcements/Hr. for
short interresponse times and the development
of a higher probability of response soon after the
last response suggests that the relative
Reinforcements/Hr., not the relative
Reinforcements/Interresponse Time, determine the
response probability. Control by Reinforcements/Hr.
is surprising since it indicates that
the occurrence of far more unreinforced short
interresponse times has little influence on the
animal’s relative probability of response at
different times after the last response. Since
the Reinforcements/Hr. for different interresponse
times are partially dependent on the
responding as well as vice versa, there seems to
be a circular relation between reinforcements and
responding during variable-interval reinforcement.

Reinforcement of only long interresponse
times greatly reduced the response rate and
shifted the highest probability of response to
long intervals after the last response. The
rather accurate timing shown by Ss apparently
was not due to sequences of overt behavior
between bar presses.
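
The two reinforcement measures contrasted in the summary can diverge because short interresponse times are so much more numerous. The sketch below uses invented counts for a hypothetical 2-hr. session (24 reinforcements in all, as a 5-min. variable-interval schedule would yield on the average); the function name and every number are assumptions for illustration only.

```python
def reinforcement_measures(n_irts, n_reinforced, session_hours):
    """Return (Reinforcements/IRT, Reinforcements/Hr.) for one class of IRTs."""
    per_irt = n_reinforced / n_irts
    per_hour = n_reinforced / session_hours
    return per_irt, per_hour


# Hypothetical counts for one 2-hr. session.
short = reinforcement_measures(n_irts=1000, n_reinforced=16, session_hours=2.0)
long_ = reinforcement_measures(n_irts=40, n_reinforced=8, session_hours=2.0)

print(short)  # (0.016, 8.0)
print(long_)  # (0.2, 4.0)
```

With these figures a long interresponse time is far more likely to be reinforced than a short one, yet short interresponse times collect twice as many reinforcements per hour, which is the pattern the summary reports for the variable-interval schedule.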


REFERENCES 


1. Feller, W. An introduction to probability
theory and its applications. New York:
Wiley, 1950.

2. Friedman, M. The use of ranks to avoid the
assumption of normality implicit in the
analysis of variance. J. Amer. statist.
Ass., 1937, 32, 675-701.

3. Mueller, C. G. Theoretical relationships
among some measures of conditioning.
Proc. Nat. Acad. Sci., Wash., 1950, 36,
123-130.

4. Skinner, B. F. The behavior of organisms.
New York: D. Appleton-Century, 1938.

5. Wilcoxon, F. Some rapid approximate
statistical procedures. New York:
American Cyanamid Co., 1949.

6. Wilson, M. P., & Keller, F. S. On the
selective reinforcement of spaced responses.
J. comp. physiol. Psychol., 1953,
46, 190-193.


(Received July 25, 1955)





Journal of Experimental Psychology 
Vol. 52, No. 3, 1956


REVERSAL AND NONREVERSAL SHIFTS IN CONCEPT
FORMATION WITH PARTIAL REINFORCEMENT
ELIMINATED¹

ARNOLD H. BUSS
Carter Memorial Hospital, Indianapolis


A previous paper by the writer (1)
investigated the effect of reversal and
nonreversal shifts on the learning of
successive discriminations. Reversal
shift involves learning a second discrimination
that is opposite to the
first, e.g., in Discrimination 1 black
is the positive stimulus and white is
the negative stimulus, and in Discrimination
2 white is positive and
black negative. Nonreversal shift involves
a change in the dimension of
the stimuli being discriminated, e.g.,
in Discrimination 1 the dimension is
achromatic color (black vs. white) and
in Discrimination 2 the dimension is
shape (circle vs. square). It was
found that reversal shift occurred
significantly faster than nonreversal
shift. The slowness of the nonreversal
group in shifting was attributed
to partial reinforcement of the first
discrimination during the learning of
the second discrimination. This partial
reinforcement served to maintain
the previously learned discrimination
and to retard learning of the subsequent
discrimination. The reversal
group did not receive such partial
reinforcement in the second discrimination,
and their new learning was
not retarded.

Recently, Kendler and D’Amato
(2) advanced a mediational theory of
concept formation that involved a
different explanation for the slow
shifting of the nonreversal group.
They argued that partial reinforcement
is not the crucial variable but
that “the important factor is the
presence of the appropriate verbal
cues for the reversal group when the
learning of the second concept is
initiated” (2, p. 166). When the
effects of partial reinforcement were
ostensibly eliminated, it was found
that the reversal shift was still significantly
faster than the nonreversal
shift. Kendler and D’Amato concluded
that partial reinforcement
could not account for their results.
This conclusion rests on the assumption
that partial reinforcement was
eliminated in the nonreversal shift.
The writer believes that careful examination
of the Kendler-D’Amato
experiments will cast doubt on this
assumption. Only their first experiment
need be examined for the present
purpose.

1 The writer is indebted to Dr. W. K. Estes
and Dr. Harvey Armus for their comments and
suggestions.


Two stimulus cards were used; one 
was chromatic and rectilinear (an orange 
diamond), and the other achromatic and 
curvilinear (a dark gray ellipse). The 
cards could be sorted for shape (recti- 
linear under one stimulus and curvilinear 
under the other) or for color (chromatic 
under one stimulus and achromatic under 
the other). Also possible were a reverse 
shape sorting (rectilinear cards under 
the curvilinear stimulus and curvilinear 
cards under the rectilinear stimulus) and 
a reversed color sorting (chromatic cards 
under the achromatic stimulus and 
achromatic cards under the chromatic 
stimulus). 

Half the Ss learned a shape concept 
first and the other half a color concept. 
We need consider only those learning a 
shape concept first, since the same line
of reasoning will apply to those learning
the color concept first. Of the Ss 
learning a shape concept first, half then 
learned a reverse shape concept (reversal 
shift), and half learned a reverse color 
concept (nonreversal shift). 

During the learning of the second 
concept the nonreversal group would 
receive partial reinforcement of the 
previously learned concept. For 
example, sorting a rectilinear achromatic 
card under the rectilinear chromatic 
stimulus would be correct during the 
learning of the first concept (shape). 
Such sorting would also be correct 
during the learning of the second concept 
(reverse color) because the color of the 
card (achromatic) would be the reverse 
of the color of the stimulus card 
(chromatic). 

The next section of the design is 
critical: 


“In fact, all rectilinear achromatic and 
curvilinear chromatic cards would provide 
partial reinforcement effects for the nonreversal 
Ss during their learning of the second concept. 
Those cards were therefore eliminated during 
the initial state of learning the second concept. 
This resulted in Ss of both the reversal and 
nonreversal groups receiving 100% nonrein- 
forcement of their sorting responses which had 
been correct for the first concept. The elimi- 
nation of these aforementioned cards had also 
other consequences. Only rectilinear chromatic 
and curvilinear achromatic cards were left in the 
response deck during the first stage of the second 
concept. Since these cards resembled one of the 
two stimulus cards (orange diamond and gray 
ellipse) both in terms of shape and color, they 
would be appropriately sorted below the same 
stimulus card for both the reverse shape and 
reverse color concept. That is, during this first 
stage, the sorting responses of the response cards 
in the abridged deck would be identical for both 
reverse shape and reverse color concepts. . . .in 
the first stage of the second concept both reverse 
shape and reverse color sorting responses are 
correct for all Ss. 

“This condition necessitated the reinsertion 
of the discarded cards during the second stage 
of learning the second concept. In order to 
compare the relative effectiveness of a reversal 
and nonreversal shift, it was necessary to have 
response cards which required different sorting 
responses for the reverse color and shape con- 
cept” (2, pp. 166-167). 
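
The sorting logic in the passage just quoted can be checked by enumeration. The sketch below is only an illustration of that logic: it encodes the two stimulus cards described above and treats each concept as a rule for choosing the stimulus card under which a response card is placed. The function and variable names are hypothetical.

```python
from itertools import product

# The two stimulus cards described in the text.
STIMULI = {
    "orange diamond": ("rectilinear", "chromatic"),
    "gray ellipse": ("curvilinear", "achromatic"),
}


def sort_under(card, rule):
    """Return the stimulus card under which `card` is placed for a given rule."""
    shape, color = card
    for name, (stim_shape, stim_color) in STIMULI.items():
        matches = {
            "shape": shape == stim_shape,
            "reverse shape": shape != stim_shape,
            "color": color == stim_color,
            "reverse color": color != stim_color,
        }[rule]
        if matches:
            return name
    raise ValueError("no stimulus card satisfies the rule")


cards = list(product(["rectilinear", "curvilinear"], ["chromatic", "achromatic"]))

# Cards on which the old shape sorting is also correct for reverse color,
# i.e., the cards that would partially reinforce the nonreversal group.
partial = [c for c in cards
           if sort_under(c, "shape") == sort_under(c, "reverse color")]

# The abridged deck left after those cards are removed.
abridged = [c for c in cards if c not in partial]

# In the abridged deck, reverse shape and reverse color call for the same sort.
identical = all(sort_under(c, "reverse shape") == sort_under(c, "reverse color")
                for c in abridged)

print(partial)    # [('rectilinear', 'achromatic'), ('curvilinear', 'chromatic')]
print(abridged)   # [('rectilinear', 'chromatic'), ('curvilinear', 'achromatic')]
print(identical)  # True
```

The enumeration agrees with the quotation: the eliminated cards are the rectilinear achromatic and curvilinear chromatic ones, and the cards that remain cannot distinguish a reverse shape concept from a reverse color concept.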


TABLE 1

Design for Groups Learning the Shape Concept First

              First Concept   Second Concept
Group         Series 1        Series 2A                        Series 2B

Reversal      shape           reverse shape or reverse color   reverse shape
Nonreversal   shape           reverse shape or reverse color   reverse color


The design is schematized in Table 1.
In the initial stage of learning the second
concept (Series 2A) there was no partial
reinforcement of the previously learned
concept for either group. However, in
eliminating partial reinforcement, the
experimental conditions were made identical
for both groups. Until the reinsertion
of the rectilinear achromatic and
curvilinear chromatic cards in the
second stage of learning the second
concept (Series 2B), there is no difference
between the reversal and nonreversal groups.
In Series 2A both groups must have
learned the same concept, although the
design does not permit us to discover
whether the concept was reverse shape
or reverse color.

However, there are rational grounds 
for assuming that in Series 2A both 
groups learned a reverse shape concept. 
In Series 1 both groups learned that of 
all the stimulus properties, only shape 
was relevant; color and size were ir- 
relevant to the discrimination. In Series 
2A the shape concept they had previously 
learned was extinguished, and they were 
required to learn a new concept. Since 
color and size were previously irrelevant 
and shape relevant, it seems likely that 
shape would be retained as the critical 
stimulus property, with the appropriate 
change to reverse shape concept. No 
doubt some Ss would switch to color and 
learn a reverse color concept, but the 
majority of Ss should learn a reverse 
shape concept. 

If most Ss in the reversal group
learned a reverse shape concept in Series
2A, they should require no further
learning trials to reach the criterion for
a reverse shape concept in Series 2B. If
most Ss in the nonreversal group learned
a reverse shape concept in Series 2A,
they would have to learn a new concept
(reverse color) in Series 2B. Furthermore,
learning a reverse color concept in
Series 2B would involve partial reinforcement
of the reverse shape concept presumably
learned in Series 2A. If we
assume that both groups learned a
reverse shape concept in Series 2A, the
shift from Series 2A to Series 2B is a
nonreversal shift with partial reinforcement
for the nonreversal group and is no shift
at all for the reversal group. Thus, the
finding that the nonreversal group learns
significantly slower than the reversal
group in Series 2B might be attributed
to the effects of partial reinforcement in
Series 2B.


The foregoing explanation rests on
the assumption that in Series 2A a
reverse shape discrimination was
learned by both groups. The following
experiment was designed to
check on this assumption.


Method


Subjects.—The Ss were 31 men and women, 
students at Indiana University, who volunteered 
to serve as Ss. Six Ss had to be eliminated from 
the study, four because they failed to learn the 
first concept and two because they misunder- 
stood later instructions. 

Stimuli.—The stimuli were wooden blocks of 
various shapes, colors, areas, and heights. The 
shapes were circular and angular (square or 
triangular), and the colors were light (yellow) 
and dark (blue or green). There were four
heights and three top surface areas, and every 
block differed from every other block in at least 
one stimulus property. 

Design.—There were three series of stimuli. 
Series 1 consisted of 30 stimuli; there were 10 
circular, 10 square, and 10 triangular blocks. 
Color, height, and area were evenly divided 
among the various shapes. The Ss were re- 
quired to learn a shape concept (circular 
positive, angular negative) to a criterion of 10 
consecutive correct trials. Then without informing
the Ss, Series 2 was instituted.

Series 2 consisted of 32 stimuli; all circular
stimuli were dark (blue or green) and all angular
stimuli (squares or triangles) were light (yellow).
The Ss were required to learn either a reversed
shape concept (circular negative, angular 
positive) or a color concept (light positive, dark 
negative) to a criterion of 10 consecutive correct 
trials. Since the reversed shape and the color 
concepts were completely confounded in Series 
2, it was necessary to have a third series in order 
to discover which concept was being learned. 

Series 3 consisted of 24 stimuli; all circular 
stimuli were light (yellow) and all angular 
stimuli were dark (blue or green). The E 
instructed Ss to continue responding in Series 3 
as they had in Series 2. Those who learned a
reverse shape discrimination in Series 2 would 
respond positively to dark angular stimuli and 
negatively to light circular stimuli, while those 
who had learned a color discrimination would 
respond in an exactly opposite fashion. Thus,
Ss’ performance in Series 3 should reveal what 
is learned in Series 2. 
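
The logic of the three series can be verified with a short enumeration. The sketch below is illustrative only: it reduces each block to its shape class and color class (height and area are irrelevant to both concepts) and uses a hypothetical function name.

```python
def responds_positive(block, concept):
    """Return True if an S holding `concept` calls the block a Vec."""
    shape, color = block
    if concept == "reverse shape":  # circular negative, angular positive
        return shape == "angular"
    if concept == "color":          # light positive, dark negative
        return color == "light"
    raise ValueError(concept)


# Block types presented in each series, as described in the Design section.
series_2 = [("circular", "dark"), ("angular", "light")]
series_3 = [("circular", "light"), ("angular", "dark")]

# Series 2: the two concepts call for the same response to every block.
confounded = all(
    responds_positive(b, "reverse shape") == responds_positive(b, "color")
    for b in series_2
)

# Series 3: the two concepts call for opposite responses to every block.
opposite = all(
    responds_positive(b, "reverse shape") != responds_positive(b, "color")
    for b in series_3
)

print(confounded, opposite)  # True True
```

Because the two concepts agree on every Series 2 block and disagree on every Series 3 block, a single unreinforced pass through Series 3 is enough to classify each S.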

Procedure.—The E read the following in- 
structions: “I am going to show you a series of 
wooden blocks, one at a time. Some of the 
blocks are Vec (V-E-C), and some are not-Vec.²
You don’t know what a Vec is now, so you'll 
have to guess. I'll tell you whether you are 
right or wrong, and in that way you will find out 
what a Vec is.” Each stimulus was presented 
for approximately 5 sec. If S did not reach the 
criterion of learning in Series 1 by Trial 30, the 
30 stimuli of Series 1 were presented again. Any 
S who did not reach the criterion by Trial 60 
was eliminated from the study. Four Ss were 
thus eliminated. Similarly, Series 2 was re- 
peated once, but no Ss had to be eliminated 
because of failure to learn by the sixty-fourth 
trial. 

At the beginning of Series 3, E said, “Now 
just keep on responding the way you are.” The
S was no longer told whether he was right or 
wrong. At the end of Series 3 each S was asked 
what a Vec was, if the concept had changed, and 
if so, what a Vec was at first. Two Ss mis- 
interpreted the instruction “Now just keep 
responding the way you are,” and they reported 
returning to the concept learned in Series 1; 
these Ss were discarded. 


2 The Vec response is designated as the
positive response and not-Vec as the negative
response.


Results


The responses of Ss in Series 3
placed them unequivocally in either
the reverse shape or the color group,
and their verbalizations accurately
reflected their response tendencies.
There were 18 Ss in the reverse shape
group and 7 Ss in the color group.
This difference yields a chi square of
4.0, which is significant at the .05
level with 1 df. Thus, when given
the option of learning a reverse shape
or color concept in Series 2, a large
majority of Ss learned the reverse
shape concept.
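
The reported chi square can be reproduced from the two frequencies. The sketch below tests the 18:7 split against an even division of the 25 Ss. The text does not say how the statistic was computed; applying Yates' correction for continuity is an assumption made here because it yields exactly the reported value, whereas the uncorrected statistic is 4.84. The function name is hypothetical.

```python
def chi_square_equal_split(observed, yates=True):
    """Chi square for observed counts against equal expected frequencies.

    With yates=True, 0.5 is subtracted from each absolute deviation
    (Yates' correction for continuity, appropriate with 1 df).
    """
    total = sum(observed)
    expected = total / len(observed)
    chi2 = 0.0
    for count in observed:
        deviation = abs(count - expected)
        if yates:
            deviation = max(deviation - 0.5, 0.0)
        chi2 += deviation ** 2 / expected
    return chi2


corrected = chi_square_equal_split([18, 7])
uncorrected = chi_square_equal_split([18, 7], yates=False)
CRITICAL_05_1DF = 3.841  # tabled chi square for P = .05 with 1 df

print(corrected)                    # 4.0
print(round(uncorrected, 2))        # 4.84
print(corrected > CRITICAL_05_1DF)  # True
```

Either way the statistic exceeds the tabled value of 3.84 for 1 df, so the conclusion in the text does not depend on the correction.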

Are there any differences between 
groups in their learning? The re- 
sponse measure used is the number of 
trials, excluding the criterion trials,
required to achieve the performance
criterion. In Series 1 there were 
several extreme scores in each group, 
and therefore the median is used. 
The median numbers of trials to learn 
are 2 and 3 for the reverse shape and 
color groups, respectively. Thus, 
both groups learned the shape concept 
quickly, and their speeds of learning 
did not differ appreciably. 

In the second series, 18 Ss learned a
reverse shape concept and 7 a color
concept. The relevant data for this
series are shown in Table 2. Both
distributions were mildly skewed but
since the deviation from normality
was not extreme, a t test was used.
The t of 2.26 is significant at the .05
level with 23 df, indicating that the
reverse shape group learned significantly
faster than the color group in
Series 2.


TABLE 2

Number of Trials to Reach Criterion
During Series 2

Group | Mean | SD | Median | Range | P
Reverse shape | 5.5 | 5.6 | 3 | 1- |
Color | 5.8


Discussion 


In interpreting the Kendler-D’Amato
experiments it was assumed that the
majority of Ss would learn a reverse
shape concept in the second series. The
data of the present experiment are consistent
with this assumption. Thus,
partial reinforcement could have been a
variable in the Kendler-D’Amato experiments,
and their data do not necessarily
prove the superiority of a mediational
approach to concept formation.

In the present study there was no
control group for the learning of reverse
shape or color in the absence of previous
learning. We might speculate that the
majority of Ss would have learned a
reverse shape concept in Series 2 whether
or not they had learned a shape concept
in Series 1. If most Ss had an initial
tendency to learn a reverse shape concept,
they would learn a shape concept
quickly. If these Ss were to learn a
reverse color concept with no prior
learning, they would first have to extinguish
their tendency to respond to
shape, which would retard their learning
of a color concept. Thus if there were
an initial tendency to learn a shape concept,
it would be learned considerably
faster than a color concept. However,
if there were no difference in the speed
of learning reverse shape and color concepts,
we could conclude that there were
no initial tendencies to respond preferentially
to either shape or color.

The Kendler and D’Amato article (2)
contains data pertinent to this issue.
They had one group of Ss learn a reverse
shape concept and another group learn
a reverse color concept, both in the
absence of previous learning. The median
number of trials to reach the
criterion was identical for both groups
(see their Table 3, p. 168). This finding
suggests that there are no preferential
tendencies in the initial learning of
reverse shape or color. Thus the tendency
of most Ss to learn a reverse shape
concept in Series 2 of the present study
may be ascribed to their prior learning of
a shape concept in Series 1.

In the present experiment partial
reinforcement was completely eliminated.
Yet the reversal group learned the second
concept significantly faster than the
nonreversal group. In the absence of
partial reinforcement a nonmediational
approach cannot account for this finding.
The writer knows of no principle of
simple learning that would predict these
results. On the other hand the mediational
theory of Kendler and D’Amato
would predict these results because
“. . . at the completion of the learning
of the first concept the implicit cue
appropriate to the second concept would
be present for Ss in the reversal group;
they would merely be connected to the
‘wrong’ sorting response” (2, p. 165).
Thus the results of the present study
unequivocally support the Kendler-D’Amato
position on concept formation.³


3 Dr. H. H. Kendler, in a personal communication,
requested permission to include the following
comment in hope that it might clarify the
issues discussed in this article. He writes, “I do
not basically disagree with Buss’s analysis of the
Kendler-D’Amato study. I do object to his
failure to maintain a sharp and clear distinction
between the assumptions and implications of a
single unit S-R formulation, which he adopted
initially to explain the superiority of a reversal
shift over a nonreversal shift, and the mediational
hypothesis (sequential S-R formulation)
that we proposed to explain the same phenomenon.
By stating now that, ‘there are rational
grounds for assuming that in Series 2A both
groups learned a reverse shape concept’ Buss
unconsciously adopts a mediational approach.
The effect of this is that the mechanism of partial
reinforcement operates differently in his two
studies. In the first study (1) he assumed the
directly observable responses of the nonreversal
groups (releasing one of two keys) were being
partially reinforced. In this study he assumes
the implicit mediating responses of the nonreversal
group are being partially reinforced.
S-R language, if it is not to be vague and misleading,
requires precise and consistent definition
of responses. The important point however is
that there now seems to be agreement that the
superiority of a reversal over a nonreversal shift
in human concept formation requires (a) a mediational
hypothesis and (b) cannot be explained
solely by the operation of partial reinforcement
of either the directly observable responses or the
implicit mediating responses.”


SUMMARY


Studies reported by Kendler and D’Amato
(2) ostensibly eliminated the effects of partial
reinforcement during a nonreversal shift in
concept formation. It seems likely that partial
reinforcement was not eliminated.

The present study was designed to investigate
reversal and nonreversal shifts in the absence
of partial reinforcement. It was found that a
reversal shift resulted in significantly faster
learning than a nonreversal shift. This finding
was interpreted as supporting the Kendler-D’Amato
mediational theory of concept
formation.


REFERENCES


1. Buss, A. H. Rigidity as a function of reversal
and nonreversal shifts in the
learning of successive discriminations.
J. exp. Psychol., 1953, 45, 75-81.

2. Kendler, H. H., & D’Amato, M. F. A
comparison of reversal shifts and nonreversal
shifts in human concept formation.
J. exp. Psychol., 1955, 49, 165-174.


(Received September 6, 1955)








Journal of Experimental Psychology 
Vol. 52, No. 3, 1956


EFFECTS OF AMOUNT OF REINFORCEMENT AND OF
PRE- AND POSTREINFORCEMENT DELAYS ON
LEARNING AND EXTINCTION¹


ELIZABETH FEHRER 
Brooklyn College 


The problem investigated in the 
two experiments reported in this 
paper arose during a survey of studies 
dealing with the effects of amount of 
reinforcement on learning and ex- 
tinction. In the majority of maze 
and runway experiments (e.g., 9, 14, 
19) animals rewarded with different 
amounts of food have been left in the 
goal box until this was eaten, and then 
were removed. This same procedure 
was presumably followed in other 
studies (e.g., 1, 4, 18) in which the 
amount of time Ss were left in the 
goal box was not reported. Time 
spent in the goal box has therefore 
been a variable in addition to the 
amount of primary reinforcement, the 
sight of various sizes or numbers of 
pellets, and the amount of consum- 
matory activity. The positive rela- 
tion typically found in these studies 
between amount of reward on the one 
hand and measures of learning and 
resistance to extinction on the other 
may have been influenced by the 
length of time Ss were rewarded in 
the goal box since the secondary 
reinforcement value of the stimuli 
from a place should be related to 
length of exposure to them. 

1 The writer wishes to express her appreciation
to Helen Ehrlich and William Hodos for their
invaluable assistance in running the majority of
the animals in Exp. I.

This positive relation between
amount of reward and runway speeds
is not so clearly evident in two studies
in which Ss fed different amounts were
left in the goal box for the same length
of time. Maher and Wickens (11)
found that running speeds during 20 
learning trials distinguished the high- 
and low-reward groups on the last 
two trials only, and this one significant 
difference (P = .05), as the authors 
point out, could have occurred by 
chance. Lawson (10) found that 
high-reward Ss ran more rapidly 
during learning but not during ex- 
tinction. However, as a very large 
number of Ss was discarded and as 
the groups in which they had belonged 
were not indicated, the generality of 
the findings is open to question. The 
lack of clear-cut differences between 
high- and low-reward groups in these 
two studies suggests that length of 
exposure to goal-box cues during 
rewarded trials may be a more im- 
portant factor in learning and extinc- 
tion than the amount of reward, and 
that it may be immaterial whether 
such exposure takes place only while 
eating (high-reward group) or while 
eating and then waiting after eating
(low-reward group). This interpre- 
tation, however, is not supported by 
an experiment by Davis (3) in which 
groups fed the same amount were left 
in the goal box for different amounts 
of time, since he found that postrein- 
forcement delay had no effect on 
learning. He did not, however, meas- 
ure its effect on extinction. 

The chief purpose of our first 
experiment was to determine the 
effects of amount of reinforcement 
and of time spent in the goal box on 
learning and extinction. Time in the
goal box refers to unrewarded time.
As there seems to be no conceivable
way of varying rewarded time while 
holding constant amount of primary 
reinforcement, incentive size, and 
consummatory activity, the question 
of its effect on learning must remain 
purely academic. 


Three groups of thirsty rats were run 
in a U maze. Group 40 and Group 10
were allowed to drink for 40 and 10 sec., 
respectively, in the goal box and were 
then removed. Group 10-D was al- 
lowed to drink for 10 sec., after which the 
water bottle was withdrawn. Thirty 
seconds later these Ss were removed 
from the goal box. This procedure 
allowed the comparison of two groups 
for whom time in the goal box was the 
same while amount of reward was varied 
(Groups 40 and 10-D), and two groups 
for whom goal-box time was varied while 
amount of reinforcement was held con- 
stant (Groups 10 and 10-D). If Groups 
40 and 10-D were similar in learning and 
extinction, it would show that perform- 
ance is more influenced by time in the 
goal box than by amount of reward, and 
this interpretation would be confirmed 
if Group 10 learned more slowly. If 
Group 40 learned most rapidly while 
Groups 10 and 10-D were slower but 
equal to each other, it would follow that 
time in the goal box had no effect. 
Finally, if the order were 40, 10, and 
10-D, it would show that postreinforce- 
ment delay retards learning. 
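
The two comparisons built into this design can be made explicit by tabulating drinking time and postreinforcement delay for each group. The sketch below merely restates the design; the dictionary layout and function name are hypothetical.

```python
from itertools import combinations

# Seconds of drinking and of postreinforcement delay on a correct trial.
GROUPS = {
    "40": {"drink_sec": 40, "delay_sec": 0},
    "10": {"drink_sec": 10, "delay_sec": 0},
    "10-D": {"drink_sec": 10, "delay_sec": 30},
}


def goal_box_time(group):
    """Total time in the goal box after a correct choice."""
    g = GROUPS[group]
    return g["drink_sec"] + g["delay_sec"]


# Pairs equated for goal-box time (so that only amount of reward differs).
same_time = [(a, b) for a, b in combinations(GROUPS, 2)
             if goal_box_time(a) == goal_box_time(b)]

# Pairs equated for amount of reward (so that only goal-box time differs).
same_reward = [(a, b) for a, b in combinations(GROUPS, 2)
               if GROUPS[a]["drink_sec"] == GROUPS[b]["drink_sec"]]

print(same_time)    # [('40', '10-D')]
print(same_reward)  # [('10', '10-D')]
```

Groups 40 and 10 differ in both respects, which is why Group 10-D is needed to separate the effect of amount of reward from that of time in the goal box.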


On the basis of current learning 
theory it seemed possible to make two 
opposite predictions for the effects of 
postreinforcement delay. On the as- 
sumption that the goal-box stimuli 
would develop secondary reinforcing 
value while Ss were drinking but that 
this should decrease during the delays, 
it should follow that Group 10-D 
would learn more slowly than Group 
10 and also extinguish more rapidly. 
Group 40 should learn most rapidly 
and extinguish most slowly since it
would have the advantage of more
primary and secondary reinforcement 
in the goal box. 

On the other hand, the delay pro- 
cedure proposed here provides several 
of the conditions that have been 
assumed to be the cause of slow ex- 
tinction under partial reinforcement 
schedules (cf. 8). First, following 
Skinner (16, pp. 76, 133), one would
predict that emotional behavior with 
its disruptive effect on performance 
should not be strongly aroused by the 
empty goal box during extinction 
since the delayed Ss should have 
become used to being in the empty 
box during training. Second, learn- 
ing and extinction trials should be 
less discriminable for the delay group 
since a period of no reinforcement 
would occur in both. Therefore the
Mowrer-Jones discrimination hypothesis
(12) should be relevant.
Third, following the Hull-Sheffield
analysis (15), the response of running
should be conditioned to the after- 
effects of nonreinforcement (delay) 
and therefore the internal cues present 
when S is placed in the starting box 
should be similar during training and 
extinction. On the basis of these 
three considerations, one would pre- 
dict that extinction for Group 10-D 
should be slower than for Group 10. 
Whether the larger reward received 
by Group 40 would outweigh the 
three factors listed above is unclear. 


Experiment I


Method 


Subjects.—The Ss were 37 male albino rats of 
the Budd Mountain Wistar strain aged 120 to 
150 days. Eleven additional Ss were discarded; 
3 in Group 40, 4 in Group 10, and 4 in Group 
10-D. 

Apparatus.—All sections of the maze were 
4.5 in. wide. The starting compartment, 9 in. 
long, was separated from the 45-in. stem by a 
guillotine door. The stem led to the base of
the U which was 31.5 in. long. The length of
each upright was 9 in. The tube of a drinking
bottle could be inserted through a hole at the 
extreme end of the left upright, a position from 
which it could not be seen from the choice point, 
thus eliminating the need for curtains. Two 
doors placed 4.5 in. apart separated the choice 
point from the two sides of the U. All three 
doors were operated by strings. The left half 
of the U was painted white while the right half, 
stem, and starting section were left the natural 
medium-brown color of the wood. The entire 
maze was covered with coarse wire mesh. 

Procedure.—Several weeks previously the Ss 
had been used in an experiment on exploratory 
activity. Pretraining for this consisted of 
placing Ss four at a time on a bare table top for 
15 min. on two successive days. On the fol- 
lowing day, half the Ss were deprived of food 
and on the next day exploratory activity in two 
small boxes was observed over a 15-min. period. 

In pretraining during the present experiment, 
each S was given three trials spaced about 15 
min. apart in the empty maze. The starting 
and choice-point doors were operated during 
these trials. Running times were measured with 
a stop watch from the time the starting section 
door was raised by E until the base of S’s tail 
had passed under one of the doors at the choice 
point. The S was detained for 30 sec. in the goal 
section chosen. On the basis of these trials, Ss 
were divided into three groups approximately 
equal in running time and turning tendencies. 
As approximately two-thirds of the turns were 
to the right side, the water bottle was placed on 
the left side of the maze. 

Two days before the first training trial, the 
water bottles were removed from the cages. On 
the next day they were inserted for 1 hr. and on
the following day training was begun with Ss 
under a 23-hr. thirst drive. 

In the training during this experiment, Ss 
were given four trials on Day 1, five trials a day 
on Days 2, 3, and 4, and six trials a day on Days 
5 and 6. The noncorrection method was used. 
The Ss were run in squads of six, the intertrial 
interval varying between 15 and 25 min. on Day 
1 and between 6 and 10 min. on Day 6. On 
each trial S was picked up from its living cage, 
run, and then returned to its cage to await the 
next trial. 

Group 40 (N = 13) and Group 10 (N = 12),
if they chose the correct side, were allowed to
drink for 40 and 10 sec., respectively, before they
were removed. If they chose the incorrect side,
they were detained for 40 and 10 sec., respectively. Group
10-D (N = 12), following a correct choice, was 
allowed to drink for 10 sec. Then the tube of 
the water bottle was withdrawn, and 30 sec. 
later S was picked up. Following an incorrect 
choice, these Ss were detained for 40 sec. 


TABLE 1 


Mean Daily Log Running Times
in Exp. I


Group 
10 


Group 


Day Come 


Training 
Day 1 
Day 2 
Day 3 
Day 4 
Day 5 
Day 6 


ae, sw 
1.23 1. 
1.14 1. 
1.04 1. 
O89 
0.78 


0.93 
O.86 


Extinction 
Day 1 


\- 
| 1.34 
Day 2 


1.64 


1.30 
1.65 


If S failed to enter a goal section by the end 
of 180 sec., the trial was terminated and S 
returned to its cage from wherever it happened 
to be. After all Ss had finished their runs for 
that day, the S was given an additional trial so 
that all Ss would have the same number of goal- 
box experiences. Any S failing to pass beyond 
the choice point on any three trials during the 
first two training days was discarded. 

Extinction was carried out on the two following
days, on each of which six extinction trials
were run. Any S that had not reached a
running-time criterion of 180 sec. was given
additional trials until this criterion was reached
or until it had run a total of 20 trials on the
second day.

The first 12 extinction trials were spaced in
the same way as the learning trials, the intertrial
interval for each S equaling the running times
of the other five Ss in a group of six. From
Trial 13 on, the interval was determined by the
running times of those Ss that had not yet met
the criterion. During extinction trials all Ss
were detained for 40 sec. in the chosen goal
section.


Results 


Learning.—During the learning pe- 
riod the three groups did not differ 
significantly in either percentage of 
correct responses or in running time. 
On the 31 training trials the mean 
number of left turns for Group 40 was 
25.23; for Group 10, 25.33; and for 
Group 10-D, 24.17. 

The daily mean log running times 
of the three groups appear in Table 1. 
On those trials in which S did not pass 













beyond the choice point in 180 sec., 
its running time was considered 180. 
The additional trial run on such a day 
was not counted in the daily average. 
The groups did not differ significantly 
on any of the six training days 
(P > .05). Therefore, these data do 
not permit the conclusion that post- 
reinforcement delay retards learning. 
The running times for Groups 40 and 
10 were closely similar and thus do 
not show the usual relation between 
amount of reward and this measure 
of learning. 
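
The scoring rule described above (a run that does not pass the choice point within 180 sec. is scored as 180 before averaging) can be sketched as follows. This is only an illustration: the function name, the sample times, and the use of base-10 logarithms are assumptions, since the text does not state the base of the logarithm.

```python
import math

def mean_log_running_time(times_sec, ceiling_sec=180.0):
    """Mean log running time, with every time capped at the ceiling.

    Assumption: base-10 logs; the paper does not say which base was used.
    """
    if not times_sec:
        raise ValueError("need at least one running time")
    capped = [min(t, ceiling_sec) for t in times_sec]
    return sum(math.log10(t) for t in capped) / len(capped)

# Hypothetical running times (sec.) for one S on one day; the last run timed out.
day_times = [10.0, 100.0, 400.0]
print(round(mean_log_running_time(day_times), 3))
```

Capping before taking logs keeps a single very long trial from dominating the daily mean, which is presumably why a fixed ceiling was adopted.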

Extinction.—Turning tendencies in
the 12 extinction trials given all Ss 
were not a usable measure of extinc- 
tion since a trial was terminated after 
180 sec., if by then S had not entered 
a goal section. All Ss, however, 
entered a goal section on at least four 
trials after their first entry into the 
previously rewarding section. Of 
these four trials, on the average, 2.54 
were to the left side for Group 40, 
2.92 for Group 10, and 3.08 for Group 
10-D. Although the data tend to 
show that Group 10-D was more 
persistent in running to the formerly 
correct side, none of the three differ- 
ences between groups was significant. 

Running speeds, in contrast, showed 
reliably slower extinction for Group 
10-D than for either of the other two 
and no difference between Groups 40 
and 10 (Table 1). On the first extinction
day, the t value for the comparison
of Group 10-D with Group 40
was 3.01 (P < .01), for Group 10-D 
with Group 10 it was 3.12 (P < .01), 
and for Group 40 with Group 10 it 
was .27. For the first six trials on the 
second extinction day the comparable 
t values were 1.92 (.10 > P > .05), 
2.19 (P = .05), and .05. 
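
For readers who want to see how a t value of this kind is obtained, here is a minimal sketch of a two-group t statistic. It assumes the pooled-variance (Student) form; the paper does not say which form was used, and the sample values below are hypothetical, not the experiment's data.

```python
import math

def pooled_t(sample_a, sample_b):
    """Student's two-sample t with pooled variance; returns (t, df)."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two scores")
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_a - mean_b) / std_err, df

# Hypothetical mean log running times for two small groups.
t_value, df = pooled_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(round(t_value, 3), df)
```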

A third measure of extinction was 
the number of trials required to reach 
various running-time criteria. In 
Table 2 are shown the mean number 


ELIZABETH FEHRER 


TABLE 2

Mean Number of Trials Required to Reach Each of Four Extinction Criteria in Exp. I

Extinction     Group     Group     Group
Criterion       40        10       10-D
60 sec.         7.62      7.75     12.00
90 sec.         8.69     10.17     14.83
120 sec.       11.31     12.67     16.42
180 sec.       14.38     13.42     19.33





of trials needed by each group to 
reach running scores of 60, 90, 120, 
and 180 sec. An S which had not met 
a given criterion at the end of 26 
trials was assigned a score of 27. 
Groups 40 and 10 were highly similar 
in the number of trials required to 
meet each of the criteria. Group 
10-D took reliably longer than the 
others to reach the first two criteria, 
and longer but not reliably so to reach 
the last two. 
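
The trials-to-criterion score just described can be sketched in a few lines. The sketch assumes that a criterion is "met" on the first trial whose running time equals or exceeds it; the function name and the sample times are hypothetical.

```python
def trials_to_criterion(running_times, criterion_sec, max_trials):
    """Number of the first trial with running time >= criterion.

    An S that never meets the criterion within max_trials is scored
    max_trials + 1 (27 when max_trials is 26, as in Exp. I).
    """
    for trial, t in enumerate(running_times[:max_trials], start=1):
        if t >= criterion_sec:
            return trial
    return max_trials + 1

# Hypothetical extinction record (sec.) for one S.
record = [20, 70, 200]
print([trials_to_criterion(record, c, 26) for c in (60, 90, 120, 180)])
```

Because unmet criteria receive a fixed score, group means built from such scores are lower bounds on the true number of trials whenever some Ss never reach the criterion.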

Both the running times and trials 
to extinction show that Ss with post- 
reinforcement delays extinguished 
later than those rewarded the entire 
time they were in the goal section. 
The difference was greater for the 
early extinction trials. In the later 
trials Ss of Group 10-D showed the 
same sharp increase in running time 
which occurred during the earlier 
trials of Groups 40 and 10. 


EXPERIMENT II


The finding that a postreinforce- 
ment delay in the goal box delays 
extinction raised the question of what 
the effects of a prereinforcement delay 
on extinction might be. The Hull- 
Sheffield condition would not apply 
here since during training each run 
would be conditioned to the after- 
effects of drinking whereas during 
extinction runs would follow the
aftereffects of no reward. But as 
both the Skinner and Mowrer-Jones 








LEARNING AND EXTINCTION 


conditions would apply, it seemed 
probable that resistance to extinction 
would be greater with delayed reward 
than with immediate and continuous 
reinforcement. 

In spite of all the experimental work 
that has been done on the effect of 
delayed reward on learning, a search 
of the literature yielded only one maze 
or runway study of its effect on ex- 
tinction. Crum, Brown, and Bit- 
terman (2) found marked resistance 
to extinction in a group subjected to 
a random sequence of 0- and 30-sec. 
delays during training. This finding 
fits in with our assumption that a 
constant delay before reward should 
delay extinction. 

No sound basis could be found for 
predicting whether a pre- or a post- 
reinforcement delay group would ex- 
tinguish more rapidly. Faster extinc- 
tion for delayed-reward Ss could be 
predicted on two grounds: first, 
because they should perform less well 
during learning, and second, because 
the Hull-Sheffield condition would not 
apply. On the other hand, the train- 
ing and extinction situations in the 
goal box might be less discriminable 
for these Ss since in both the im- 
mediate consequences of entering the 
goal box would be identical, i.e., no 
reward. For this reason it was felt 
that they might extinguish more 
slowly than the group with postrein- 
forcement delay since for the latter a 
goal-box entry would be immediately 
followed by reward during training 
but by no reward during extinction. 

The purpose of this second experi- 
ment was to compare such delay 
groups with each other and with 
groups that were not delayed in the 
goal box. As errors had not differ- 
entiated the groups run in Exp. I, the 
U maze was converted into a runway 
by placing a block where the door 
leading to the right arm of the U had 
been. All Ss were run by the same E. 


Method 


Subjects.—The Ss were 48 naive male albino 
rats of the Budd Mountain Wistar strain aged 
75 to 9 days at the beginning of the experiment. 

Pretraining procedure.—Several days after 
arrival from the supplier the Ss were placed four 
at a time in a modified Dashiell open-alley maze 
for 30 min. and then returned to their cages from 
which the water bottles had been removed. On 
the next two days, under a 23-hr. thirst drive, 
they were replaced four at a time for 30 min. in 
the Dashiell maze in which two water bottles 
had been set up in different places. All Ss 
discovered the bottles and drank for several 
minutes on each day. On the following day, 
again under a 23-hr. thirst drive, training was 
begun. 

Training procedure.—On the basis of running 
time on the first trial, a rewarded one, the Ss 
were divided into four groups of 12 each. 
Groups 30, 10, and 10-D were treated like the 
comparable groups in Exp. I save that the 
drinking time was cut from 40 to 30 sec. for 
Group 30 and the delay period from 30 to 20 sec. 
for Group 10-D. The Ss of Group D-10 were 
allowed to drink for 10 sec. after they had been 
in the goal section for 20 sec. 

Five trials were given on Day 1 but only the
last four were used in computing that day’s
mean running speeds. On Days 2 to 5 each S
was given 8 trials. The Ss were run in rotation
in groups of four. Intertrial intervals averaged 
about 15 min. on the first day and about 5 min. 
on the last day. 

Extinction procedure.—Extinction was carried
out on the sixth day. The squads of four
Ss were run in rotation until 12 trials had been 
completed. Thereafter, in order to speed up 
extinction, each S that had not met the 180-sec. 
extinction criterion was run by massed practice 
until it reached this criterion or had run a total 
of 20 trials. In these massed trials S was put 
back in the starting section immediately after 
being picked up from the goal section. 

Two Ss in the same squad, one from Group 
10 and one from Group 10-D, refused to run on 
Day 2. Subsequently all four Ss of this squad 
were discarded, leaving 11 Ss in each group.


Results 


Learning.—The mean log running
times for successive blocks of four 
trials during the learning period are 
shown in Table 3. The running 
speeds of the four groups did not 
differ reliably on Days 1, 2, and 3, 
but on Days 4 and 5 Group D-10 













lagged behind the others. Since the 
group variances did not differ sig- 
nificantly on either of these days, it 
was possible to test the reliability of 
the differences among means through 
analysis of variance. For Day 4 the
F ratio of 3.17 was significant at the
.05 level; and t tests showed that
Group D-10 ran significantly more
slowly than Groups 10 (P = .05) and
10-D (P = .02). On Day 5 the
differences observed on Day 4 became
more marked. The F ratio rose to
4.58 (P = .01); and the comparable
P values were .02 and .01. The delayed-reward
group, though slower
than Group 30 on Days 4 and 5, was
not reliably slower. As in Exp. I,
none of the differences among Groups
30, 10, and 10-D was significant,
although on Days 3 to 5 Group 30
actually ran more slowly than the
other two.
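
The F ratios reported for the learning data come from a simple analysis of variance on the group means. A minimal sketch of that computation is given here; the function name and the toy scores are hypothetical. With four groups of 11 Ss, as in this experiment, the degrees of freedom would be 3 and 40.

```python
def one_way_f(groups):
    """One-way analysis of variance; returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group size times squared deviation of its mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: deviations of scores from their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Toy scores for three groups (not the experiment's data).
print(one_way_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]))
```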


Certain qualitative differences in the behavior 
of the several groups were apparent during the 
training trials. The Ss of Groups 10, 10-D and
D-10 seemed to drink more vigorously than those 
of Group 30 even though these Ss drank steadily 
throughout their 30-sec. period. This might be 
related to the somewhat faster running times of 
Groups 10 and 10-D as compared with Group 30. 
After the first few trials, Ss of Group 10-D 
usually sat quietly in the bend of the goal section 


TABLE 3

Mean Daily Log Running Times during Training and Mean Log Running Times on Three Blocks of Four Trials during Extinction in Exp. II

Segment       Group 30   Group 10   Group 10-D   Group D-10
Training
  Day 1         1.28       1.23       1.31         1.26
  Day 2         1.18       1.00       1.11         1.10
  Day 3         0.87       0.79       0.80         0.93
  Day 4         0.74       0.58       0.62         0.85
  Day 5         0.66       0.52       0.58         0.83
Extinction
  Block 1       1.21       0.96       0.69         0.98
  Block 2       1.72       1.60       1.04         1.18
  Block 3       1.88       1.72       1.52         1.27


















during the delay period. The Ss of Group D-10 
seemed more nervous and agitated than the 
others, and about half developed “superstitious” 
behavior (17) the nature of which varied from 
S to S. Two Ss, for example, always groomed
for a few seconds before entering the goal section. 
Another turned around vigorously a number of 
times in front of the water-bottle hole and 
several others made a number of stereotyped 
runs back and forth in the goal section during 
the delay period. 


Extinction.—The 12 extinction 
trials given all Ss were divided into 
three blocks of four trials. The mean 
log running times of each group during 
each block of trials appear in Table 3. 
The variances did not differ signifi- 
cantly. Analyses of variance indi- 
cated marked inhomogeneity of the 
four group means on each block of 
extinction trials. The F ratios for 
the three successive blocks were 7.28, 
19.30, and 8.46. All are significant 
beyond the .001 level.

Group 30 ran more slowly than 
Group 10 (P = .05) during the first 
four extinction trials. Since the comparable
groups in Exp. I were highly
similar, the difference here is probably 
a chance one. It is, however, inter- 
esting to note that with Group 30 
slow learning is coupled with fast 
extinction, a finding opposite to that 
for Group D-10. 

Group 10-D showed slower extinc- 
tion throughout the 12 trials than 
either Groups 30 or 10. Five of the 
six comparisons were significant
beyond the .05 level. As in Exp. I, 
the difference between this group and 
the other two was greater for the 
earlier than for the later trials. In 
fact on Trials 9 through 12 the differ- 
ence between Group 10-D and Group 
10 failed to reach the .05 level. 

Although Group D-10 did not differ 
reliably from Groups 30 and 10 during 
the first four extinction trials, the fact 
that they had been running more 
slowly during the learning trials 











indicates less disruption of the re- 
sponse they had learned. On Ex- 
tinction Trials 5 through 12, Group 
D-10 ran significantly faster than the 
two nondelay groups. 

A comparison of the two delay 
groups shows faster running on the 
part of Group 10-D during the first 
eight trials but reliably faster running 
on the part of D-10 during the last 
four. Delayed reward, therefore, 
though it results in less evidence of 
learning, does seem to establish a 
habit that is relatively resistant to 
extinction. The slope of the extinc- 
tion curve of Group D-10 is far less 
steep than that of any other group. 


TABLE 4

Mean Number of Trials Required to Reach Each of Seven Extinction Criteria in Exp. II

Extinction     Group     Group     Group     Group
Criterion       30        10       10-D      D-10
45 sec.         3.91      4.82      9.45     10.55
60 sec.         4.45      6.45     11.09     15.45
75 sec.         6.18      7.91     12.36     16.36
90 sec.         8.45      9.09     14.09     17.54
120 sec.        9.91     13.09     16.09     19.09
150 sec.       11.73     14.73     17.45     20.36
180 sec.       12.27     14.91     18.45     20.54


The mean number of the trial on 
which each group met various run- 
ning-time criteria for extinction is 
shown in Table 4. When t ratios
were computed for all possible comparisons
among groups, none of the
differences between Groups 30 and 10
was significant. Group 10-D required
significantly more (P < .05)
trials than Group 30 to reach all
criteria and significantly more trials
than Group 10 to reach the 45-, 60-,
75-, and 90-sec. criteria. Group D-10
reached each criterion significantly
later than Groups 30 and 10 and
reached all but the 45-, 90-, and 120-sec.
criteria reliably later than Group 10-D.







Actually, the reported values underestimate 
the slow extinction of Group D-10 since a far 
larger proportion of the Ss in this group than of 
any other failed to meet each criterion at the 
end of the 20 extinction trials. Two Ss of 
Group D-10 failed to meet the 45-sec. criterion 
whereas the numbers from Groups 30, 10, and 
10-D were zero. The comparable frequencies
for the 90-sec. criterion were 5, 0, 0, and 0 and
for the 180-sec. criterion, 7, 1, 3, and 2. Since a
score of 21 was assigned an S failing to reach a
given criterion by Trial 20, it is obvious that 
extinction for Group D-10 was considerably 
slower than the data in Table 4 indicate. 

Certain qualitative group differences were
clearly apparent during the extinction trials but
unfortunately not measured. It is the distinct
impression of E that the delay groups made far
fewer retracings in the stem of the maze than
Groups 30 and 10 among whom retracings were
very frequent. In fact, the slowing up of Group
D-10 seemed to consist mainly of longer waits in
front of the door to the goal section. The Ss of
Group 10-D retraced somewhat more than those
of D-10.

During the early extinction trials, exploratory
behavior in the goal section was far less extensive
for Group 10-D than for the nondelay groups.
In the later trials the nondelay Ss often entered
the goal section only part way, whereas Ss of the
two delay groups typically went all the way
around the bend to the water-bottle hole. Many
of the Group D-10 Ss persisted in their “superstitious”
behavior.


Discussion 


The learning data are in general
consistent with those of previous studies.
The slower learning of the delayed-reward
Ss is typical of that reported
elsewhere (e.g., 13). The fact that
postreinforcement delay did not affect
learning criteria agrees with the findings
of Davis (3). The data confirm Hull's
assumption that performance is a function
of the speed with which a reward
follows a response. They imply that
postreinforcement delay in the goal 
section has no effect on the secondary 
reward value of goal-box cues. 

Although the majority of studies in
which amount of reward has been a
variable have shown that speed of
running is positively related to amount
of reinforcement, the fact that we did
not find this can be accounted for in
several ways. First, behavior under the
thirst drive may be affected differently 
by amount of reward than behavior 
motivated by hunger. Second, and 
probably more important, is the fol- 
lowing fact. In previous studies rewards 
have been pellets of food of various sizes
which Ss could eat at their own rate. 
In the present experiments, by contrast, 
Ss were allowed to drink what they could 
from a large reservoir of water before 
either it or they were removed. If this 
led the 10-sec. reward Ss to drink more 
vigorously, as they actually seemed to 
do, their stronger R_G could influence the
vigor of the antecedent running response 
and so counterbalance the effect of the 
smaller primary reinforcement they re- 
ceived. Third, if incentive size affects 
running speed more strongly than 
amount of reinforcement does, one might 
find no difference between groups allowed 
to consume different amounts. Wolfe 
and Kaplon (18) found that chickens 
run more slowly for one kernel of corn 
than for four quarter-kernels which
probably look like more than one whole 
one. This hypothesis is not necessarily 
invalidated by Guttman’s finding (5) 
that running speed is related to amount 
of reinforcement when incentive size and 
consummatory activity are held con- 
stant, since it is possible that the effect 
of incentive size might dominate that of 
primary reinforcement when both factors 
are varied. 

Results similar to ours might be found 
with food reinforcement if hungry Ss 
were allowed to eat for different periods 
of time from a large supply of food. It 
is true that Jenkins and Clayton’s study 
(7) of the pecking responses of pigeons 
which were periodically rewarded by 
either 2 or 5 sec. of eating from a bin of 
food showed faster pecking rates among 
the 5-sec. feeders. However, the ob- 
served difference might have been even 
greater if incentive size had also been 
varied. 

The fact that the nondelay groups 
were similar in extinction is consistent 
with the fact that they were similar in 
learning. The comparatively slow extinction
of the delay groups requires
interpretation as does also the fact that 
Group D-10 was even more resistant to 
extinction than Group 10-D. 

The concept of secondary reinforce- 
ment seems of no more value in inter- 
preting our extinction data than it has 
been in accounting for the slow extinction 
following partial reinforcement (8, p. 
227). The concept leads to the pre- 
diction that Groups 10-D and D-10 
should extinguish faster than the non- 
delay groups; whereas the reverse was 
found. 

It has already been pointed out that 
our postreinforcement delay procedure 
provided three of the conditions that 
have been assumed to delay extinction 
after partial reinforcement training 
schedules but that only two of these 
conditions were present for the delayed- 
reward group. Our data may therefore 
provide a test of the relative importance 
of the Hull-Sheffield, Skinner, and
Mowrer-Jones assumptions which were 
briefly described in the introductory 
section of Exp. I. 

The Hull-Sheffield hypothesis is clearly 
supported by the slow extinction of 
Group 10-D as compared with that of the 
nondelay groups. Yet the fact that 
Group D-10 to which the hypothesis 
cannot be applied extinguished even 
more slowly leads one to question the 
general importance of this assumption 
and to search for principles of wider 
applicability. 

Skinner’s assumption is well supported 
by the quiet goal-box behavior of Group 
10-D during the extinction trials. Ap- 
parently the delay periods during train- 
ing permitted their adaptation to the 
empty goal section. In contrast, the 
nondelay groups showed disturbed 
searching behavior and extinguished 
rapidly. The assumption is indirectly 
supported by the behavior of Group 
D-10 even though these Ss showed 
agitated behavior throughout training. 
Yet, as their behavior during extinction 
was equally agitated, they apparently 
did not develop distinctive emotional 
reactions that might seriously disrupt 







the running response. This hypothesis, 
however, does not show why Group D-10 
extinguished even more slowly than 
Group 10-D. 

The Mowrer-Jones discrimination hy- 
pothesis seems the most fruitful in 
explaining not only the difference be- 
tween the delay and nondelay groups 
but also that between the two delay 
groups. As the manner of its appli- 
cation has already been described in the 
introductory section of Exp. II, it need 
not be repeated here. 

Because of its flexibility, expectancy 
theory can readily be adapted to explain 
the data. At first glance it does not 
seem to account for the slow extinction 
of Group 10-D. As each training run 
was immediately rewarded, one would 
expect this group to develop as firm an 
expectancy of reward on every trial as 
Groups 30 (40) and 10. If so, it is hard 
to see why Group 10-D did not relinquish 
this expectancy as readily during the 
extinction trials. If, however, Group 
10-D also expected that a period of no 
reward in the goal box would be followed 
by a reward on the next run, the com- 
bination of the two expectancies might 
be sufficient to delay extinction and, 
once the second expectancy had been 
relinquished, to account for the fact that 
during the later trials extinction pro- 
ceeded rapidly. 

Group D-10 probably encountered 
difficulty in forming firm expectancies. 
Those Ss that did not develop “super- 
stitious” behavior may never have 
figured out which of their many responses 
brought reward. Still, since they proba- 
bly did expect that it was something they 
did in the goal section that brought 
water, they should have kept on running 
to it during extinction. If the “super- 
stitious” Ss firmly expected that what 
they did brought water, they should have 
been able to abandon this hypothesis 
fairly readily. As the appropriate pro- 
tocols were not kept, it is impossible to 
know whether the superstitious Ss of 
Group D-10 extinguished more readily 
than the others. Whether or not they 
did, expectancy theory, as interpreted 
here, probably can explain the slow 




extinction of the average member of this 
group. 

It is interesting to note that the non- 
delay Ss who showed much running back 
and forth in the goal section during 
extinction also showed much retracing 
in the maze stem, while the delay groups 
showed both types of behavior to a much 
lesser extent. This might be an illus- 
tration of the elicitation of goal-box 
responses by stimuli from earlier parts 
of the runway in much the same manner 
that Hull (6, pp. 124 ff.) has assumed 
that R_G, first elicited by food in the goal
box, is later conditioned to the cues from 
earlier parts of the maze. This obser- 


vation, if confirmed by careful records of 
behavior, would support an interference 
theory of extinction, since here the 
forward running response is obviously 
interfered with by responses similar to 
those elicited by the empty goal box. 


SUMMARY 


Two experiments were performed to deter- 
mine the effects of amount of reinforcement and 
of pre- and postreinforcement delays in the goal 
box on learning and extinction. 

In Exp. I, three groups of rats under 23-hr. 
water deprivation were given 31 rewarded trials 
in a U maze. Two groups were allowed to drink
for 40 and 10 sec., respectively, before being 
removed from the goal section. A third group 
drank for 10 sec., after which the water bottle 
was withdrawn, and 30 sec. later these Ss were 
removed. Neither errors nor running speeds 
distinguished the three groups during training, 
but the group with postreinforcement delay 
exhibited reliably greater resistance to extinction 
than the other two. 

Experiment II was essentially a repetition of 
the first with the addition of a delayed reward 
group. A runway replaced the maze. The
data for the three groups comparable to those
of Exp. I confirmed the results of that experiment.
The group with prereinforcement delay
learned more slowly but showed more resistance
to extinction than any of the other groups.

As goal-box delays provide several of the
conditions that have been assumed to be 
responsible for slow extinction following partial 
reinforcement, an attempt was made to interpret 
the results on the basis of the same assumptions. 

The absence of running-speed differences 
between high- and low-reward groups was 
explained on the basis of the particular rein- 
forcement procedure used. 







REFERENCES

1. Crespi, L. P. Quantitative variation of incentive and performance in
   the white rat. Amer. J. Psychol., 1942, 55, 467-517.

2. Crum, J., Brown, W. L., & Bitterman, M. E. The effect of partial and
   delayed reinforcement on resistance to extinction. Amer. J. Psychol.,
   1951, 64, 228-237.

3. Davis, A. D. A test of one aspect of contiguity theory. J. exp.
   Psychol., 1954, 48, 275-277.

4. Grindley, G. C. Experiments on the influence of the amount of reward
   on learning in young chickens. Brit. J. Psychol., 1929, 20, 173-180.

5. Guttman, N. J. Operant conditioning, extinction, and periodic
   reinforcement in relation to concentration of sucrose used as
   reinforcing agent. J. exp. Psychol., 1953, 46, 213-224.

6. Hull, C. L. A behavior system. New Haven: Yale Univer. Press, 1952.

7. Jenkins, W. O., & Clayton, F. L. Rate of responding and amount of
   reinforcement. J. comp. physiol. Psychol., 1949, 42, 174-181.

8. Jenkins, W. O., & Stanley, J. C., Jr. Partial reinforcement: a review
   and critique. Psychol. Bull., 1950, 47, 193-234.

9. Lawrence, D. H., & Miller, N. E. A positive relationship between
   reinforcement and resistance to extinction produced by removing a
   source of confusion from a technique that had produced opposite
   results. J. exp. Psychol., 1947, 37, 494-509.

10. Lawson, R. J. Amount of primary reward and strength of secondary
    reward. J. exp. Psychol., 1953, 46, 183-187.

11. Maher, W. B., & Wickens, D. D. Effect of differential quantity of
    reward on acquisition and performance of a maze habit. J. comp.
    physiol. Psychol., 1954, 47, 44-46.

12. Mowrer, O. H., & Jones, H. Habit strength as a function of the
    pattern of reinforcement. J. exp. Psychol., 1945, 35, 293-311.

13. Perin, C. T. A quantitative investigation of the
    delay-of-reinforcement gradient. J. exp. Psychol., 1943, 32, 37-51.

14. Reynolds, B. Acquisition of a simple spatial discrimination as a
    function of the amount of reinforcement. J. exp. Psychol., 1950, 40,
    152-160.

15. Sheffield, V. F. Extinction as a function of partial reinforcement
    and distribution of practice. J. exp. Psychol., 1949, 39, 511-526.

16. Skinner, B. F. The behavior of organisms. New York:
    Appleton-Century, 1938.

17. Skinner, B. F. “Superstition” in the pigeon. J. exp. Psychol., 1948,
    38, 168-172.

18. Wolfe, J. B., & Kaplon, M. D. Effect of amount of reward and
    consummative activity on learning in chickens. J. comp. Psychol.,
    1941, 31, 353-361.

19. Zeaman, D. Response latency as a function of the amount of
    reinforcement. J. exp. Psychol., 1949, 39, 466-483.


(Received August 23, 1955) 





Journal of Experimental Psychology 
Vol. 52, No. 3, 1956 


REWARD PROBABILITY, AMOUNT, AND INFORMATION AS 
DETERMINERS OF SEQUENTIAL TWO-ALTERNATIVE 
DECISIONS ¹


WARD EDWARDS 


Operator Laboratory, Air Force Personnel and Training Research Center 
Lackland AFB, San Antonio, Texas 


The development of stochastic 
learning theories (2, 6, 7) has stimu- 
lated study of situations in which Ss 
make repeated choices between two 
alternatives when the probability of 
being rewarded for choosing each 
alternative is greater than zero but 
less than one. Such experiments, 
often called probability learning ex- 
periments, are at present the best 
means of testing the stochastic learn- 
ing theories. They are also inter- 
esting for their own sake. They are 
examples of the very important and 
common kind of decision-making in 
which the decider must learn about 
the situation by observing the result 
of his decisions (4, 5). 

This paper reports two such experi- 
ments, both using real money rewards. 
In one the major independent variable 
is probability of reward; in the other 
both probability and amount of 
reward are used as independent vari- 
ables. The results support two new 
static theories about asymptotic 
probabilities of choice, which predict 
their rank order with good accuracy 
and their amount with fair accuracy. 
They also raise doubts about the
familiar generalization which says
that in probability learning situations
the asymptotic probability of choice
is equal to the probability of reward.


¹ This research was supported by Contract
N5ori-166, Task Order 1, between the U. S.
Office of Naval Research and The Johns Hopkins
University. This is Report No. 166-1-199,
Project Designation No. NR 145-089 under that
Contract. The experimental work and most of
the data analysis were done while the author
was at The Johns Hopkins University. I am
grateful to Miss Frances Brown, Miss Suzanne
Wollenberg, Mr. Klaus Rennert, and Mrs.
Harriott Quin for running Ss and analyzing data.


METHOD


Apparatus.—The apparatus used in the two 
experiments reported here was a slot machine. 
The S sat in front of an L-shaped console. At 
the top of the console was a red light, the “chip- 
demanding light.” Immediately below it was a 
slot, into which at the beginning of each trial S 
inserted chips until the chip-demanding light 
went out. When it did, another red light located
below the slot, the “enough-chips light,” turned 
on. Below the enough-chips light were two 
white lights, side by side, called the “informing 
lights.” Below each informing light was a 
button, set in an offset so as to be at an angle of 
45° to S. Below the two buttons was a well into
which the machine could deliver chips, one at a
time. No S was permitted to use paper or
pencil.

The 


completely controlled by a programmer con- 


functioning of the slot machine was 
The program could be 
prepared to require any number of chips before 
went off and the 
When the enough- 
chips light went on, S pressed one of the two 
buttons. 

happened.) 


cealed in another room. 


the chip-demanding light 
enough-chips light went on 


(If he pressed both buttons, nothing 

The program could be prepared so 
that either, both, or neither of the buttons was 
correct, or so that one button paid off one 
number of chips and the other paid off a different 
number. The informing lights could be used or 
not used; if they were used, each light flashed 
once (after S had pressed either button) for 
each chip that S could have won by pressing the 
button under it, regardless of which button he 
did press. Thus they served to inform S of 
what the effect of pressing the button he didn’t 
press would have been. All these variables
could be changed completely with each new trial.

All responses and outcomes were automatically
recorded on a much-modified Esterline-Angus
operations recorder.




Subjects.—Both experiments used randomly 
chosen Johns Hopkins male undergraduates as 
Ss. Names were selected at random from a 
directory of students and a letter of invitation 
was sent to each potential S. The refusal rate 
was high enough so that those who accepted 
cannot be regarded as a random sample of the 
Hopkins undergraduate population. They were, 
presumably, a random sample of those Hopkins 
undergraduates who were willing to participate 
in prolonged gambling experiments. 

Each experimental group consisted of six 
arbitrarily assigned Ss; there were four groups 
in the first experiment and three in the second.

General procedure.—Each S, on arrival, was 
shown the slot machine and told how to operate 
it. All Ss were told that it was programmed, 
but they were also told that the program was 
prepared in advance, so that the choice on one 
trial had no effect on the outcome of the next 
trial. Each S was instructed that his sole 
purpose in the experiment was to make as much 
money as possible. He received a dollar’s worth 
of chips at the beginning of each 150-response 
session. At the end of the session, he exchanged 
his chips for money. If (as happened very 
rarely) he lost all his chips during the course of a 
session, he bought more, using his own funds for 
the purpose.

After all experimental sessions were over, S 
participated in one or two extra sessions, in 
which he won or lost enough money to bring his 
total winnings over the whole experiment to 
approximately $1 per hour. 


Experiment I


Purpose.—This experiment was designed
to find out what Ss do in such
gambling situations as a function of
probability of reward and of reward 
information. 


Program.—There were four groups in this 
experiment, defined by two major variations in 
conditions under which reward was presented. 
The informing lights could be used (symbolized 
L for light) or not used (D for dark). Also, the 
program could be prepared so that one or the 
other of the two buttons always paid off (O for 
opposed), or the buttons could be programmed 
independently of each other, so that on a particular
trial neither or both might be correct (I
for independent).

Each group was run for nine days, 150 trials 
per day. On Day 1, for all Ss, both buttons 
were correct on half the trials and neither was 
correct on the other half; the informing lights 
were, of course, off. This day was intended to 
check for position preferences. None were 
found. On Days 2 through 8, the treatments 
OL, OD, IL, and ID were applied to the four 
groups, each group getting the same treatment 
on all days. On Days 2, 4, 6, and 8, the proba- 
bility of reward was .5 for each button. These 
days were control days. On Day 3 the proba- 
bility of being rewarded on the left button was 
.4; on Day 5, .3; and on Day 7, .8. For Days
1 through 8, the probability of being rewarded 
on the two buttons added up to 1. On Day 9, 
the informing lights were off for all Ss, the left 
button never paid off, and the right button paid 
off on half of the trials. Days 3, 5, 7, and 9 will
be called the experimental days. 
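The difference between the opposed and independent reward conditions can be made concrete with a short sketch. The paper does not say how the trial-by-trial programs were generated, so the random draws below are an assumption, and `make_schedule` and its parameters are hypothetical names introduced only for illustration. The example uses the Day 7 probabilities (.8 on the left, .2 on the right).

```python
import random


def make_schedule(n_trials, p_left, p_right, opposed, seed=0):
    """Return one (left_pays, right_pays) pair of booleans per trial.

    Assumption: each trial is an independent random draw. The original
    program was prepared in advance by an unstated method.
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_trials):
        left_pays = rng.random() < p_left
        if opposed:
            # Opposed (O): exactly one button pays off on every trial,
            # so the right button is simply the complement of the left.
            right_pays = not left_pays
        else:
            # Independent (I): the right button is drawn separately, so
            # both buttons or neither may be correct on a given trial.
            right_pays = rng.random() < p_right
        schedule.append((left_pays, right_pays))
    return schedule


day7_opposed = make_schedule(150, 0.8, 0.2, opposed=True)
day7_independent = make_schedule(150, 0.8, 0.2, opposed=False)

# Proportion of trials on which the left button pays off in each program.
print(sum(left for left, _ in day7_opposed) / 150)
print(sum(left for left, _ in day7_independent) / 150)
```

In the opposed program the two outcomes are perfectly negatively related; in the independent program a sizable share of trials has both buttons correct or both wrong, a point the discussion of Group IL returns to.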

The assumption underlying the design is that 
the 150 responses on each control day provided 
an opportunity for each S to return to a neutral 
condition, in which his probability of choosing
each alternative was about .5. Consequently
at the beginning of each experimental day S
essentially started the experiment anew. The
data support this assumption, which is of course
a much-watered-down version of the path
independence assumption of stochastic learning
models.

Fig. 1. Probability of choosing the left button for all Ss in Exp. I. Note that each 25-trial
block is named by its upper end point. The horizontal line for each day is the probability of being
rewarded for choosing the left button on that day. [Figure not reproduced. Ordinate: probability of
pushing the left button. Abscissa: 25-response blocks of trials, one panel per day.]

On all days of Exp. I, each chip was worth 5 
cents, S had to ante one chip at the beginning 
of each trial, and if he won, he won two chips. 


Results.—Figure 1 shows the probability
of pressing the left button by
25-trial blocks for all of Exp. I. 
Analysis of variance showed that the 
four groups, OL, OD, IL, and ID, 
were not significantly different from 
one another on any day, so they are 
combined here. The only possible 
exception is that on Days 7 and 9 
Group IL seemed to be a little more 
extreme than the other groups—but 
this difference was insignificant. 

The asymptotic probability of
choice in experiments like these has
been extensively studied. Grant,
Hake, and Hornseth (13) originated
the generalization which says that
the asymptotic probability of choice
will be equal to the probability of
reward; a number of experiments
have been interpreted as confirming
this generalization. Hereafter this
paper will call it the probability
matching generalization. The cases
of both the Estes and the Bush-Mosteller
models which are usually
applied to such situations predict it.
Inspection of Fig. 1 shows that the
results on Days 3, 5, and 7, the only
days for which the probability matching
generalization predicts asymptotic
probabilities of choice other than .5,
fail to support that generalization.
On all three days the asymptotic
probability of choice is more extreme
than the probability of reward.

In experiments of this sort the only
information available to S when he
makes a choice is the nature and
results of previous choices. It is
sensible to assume, therefore, that
S’s later choices will depend on his 
earlier choices and their outcomes; 
that is, that there will be sequential 
dependencies among successive re- 
sponses. The measure best suited 
to examining most such sequential 
dependencies is the information meas- 
ure. Fortunately, Fig. 1 suggests 
that most of the change in frequency 
of choice as a result of a change in 
probability of reward takes place 
during about the first 50 trials. This 
convenient fact permits analysis of 
the last 100 responses on each day by 
information-theory techniques. 
McGill (16, 17) has developed a 
method for analysis of transmitted 
information which is closely analogous 
to analysis of variance. As used here, 
the method assumes that the in- 
formation in a response may be attrib- 
uted to individual differences (Ss), 
the previous response, whether or not 
the previous response was rewarded, 
and the interactions of these three
variables. (Probabilities of reward
were not included because the analysis
was done separately for each day.)
The amounts of information transmitted
by each of these variables were
significant when tested by the Miller-Madow
test (18). The interactions
cannot be directly tested for significance
by that test, but various
indirect tests suggest that the interaction
between the previous response
and its reward was significant, and
that the other interactions were not.²
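As a rough illustration of the quantity being partitioned, the sketch below estimates the information transmitted from the previous response to the present response, using the identity T(x; y) = H(x) + H(y) - H(x, y). It covers only one predictor, not the full multivariate partition with Ss and reward. The response string is hypothetical, and the statistic 2N ln 2 · T is the ordinary likelihood-ratio quantity; treating it as the form of the Miller-Madow test used in the paper is an assumption.

```python
from collections import Counter
from math import log, log2


def entropy(counts):
    """Plug-in entropy, in bits, of a collection of category counts."""
    counts = list(counts)
    n = sum(counts)
    return -sum(c / n * log2(c / n) for c in counts if c > 0)


def transmitted_information(pairs):
    """T(x; y) = H(x) + H(y) - H(x, y), in bits, from (x, y) pairs."""
    joint = Counter(pairs)
    xs = Counter(x for x, _ in pairs)
    ys = Counter(y for _, y in pairs)
    return (entropy(xs.values()) + entropy(ys.values())
            - entropy(joint.values()))


def likelihood_ratio_statistic(pairs):
    """2 * N * ln(2) * T, referred to chi-square under independence."""
    return 2 * len(pairs) * log(2) * transmitted_information(pairs)


# Hypothetical record of left (L) and right (R) presses for one S.
responses = list("LLRLLLRLLRLLLLRLLLRL")
# Pair each response with the one that immediately preceded it.
pairs = list(zip(responses[:-1], responses[1:]))

t_bits = transmitted_information(pairs)
print(f"T(previous; present) = {t_bits:.3f} bits")
print(f"likelihood-ratio statistic = {likelihood_ratio_statistic(pairs):.3f}")
```

A strictly alternating record transmits a full bit from each response to the next, whereas a record in which every pairing is equally frequent transmits none.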


² One of the difficulties with this kind of
analysis of nonorthogonal data is that the
interactions may be negative. (Contrary to
most impressions, the same is true for analysis
of variance of nonorthogonal data.) The
presence of negative interactions greatly
complicates interpretation of both interactions and
simple transmission components. Fortunately,
those interactions which were negative in this
experiment were too small to cause any such
difficulties. No patterning in the occurrence of
negative interactions could be found. Garner
and McGill (10) give a full discussion of this
problem.





Fig. 2. Percentage of total information in
the last 100 trials of each day of Exp. I which is
not accounted for by Ss, the immediately preceding
response, its reward or nonreward, and
interactions of these variables. [Figure not reproduced.
Ordinate: percentage of response information left
unexplained. Abscissa: days. Separate curves for
Groups IL, ID, OL, and OD.]


Although these various sources of 
transmitted information were signifi- 
cant, this does not necessarily mean 
that they were important. Figure 2 
shows the percentage of information 
in the responses which was not trans- 
mitted by any of the above three 
variables or their interactions. It is 
evident that the three variables 
account for only about 16% of the 
total response information for three 
of the four groups, and only about 
30% for Group IL. (Once again we 
have some reason to suspect that 
Group IL is different from the other 
three; we will return to this question 
in Exp. II and in the discussion 
section.) Examination of Fig. 2 also 
suggests that the percentage of un- 
explained information is usually less 
for experimental days than for control 
days. This is probably an artifact 
resulting from the fact that the total 
amount of response information is 
less (i.e., the probability of choice of
the two alternatives deviates more
from .5 each) on experimental days 
than on control days. 
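The artifact can be seen from the entropy of a two-alternative response alone. The sketch below is illustrative only: it computes the maximum information available in a binary response for a few choice probabilities, none of which are taken from the data. `binary_entropy` is a name introduced here.

```python
from math import log2


def binary_entropy(p):
    """Entropy, in bits, of a two-alternative response chosen with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * log2(p) + (1 - p) * log2(1 - p))


# Total response information shrinks as choice departs from .5, so the
# same absolute amount of explained information is a larger share of it.
for p in (0.5, 0.7, 0.8, 0.9):
    print(f"p = {p:.1f}: H = {binary_entropy(p):.3f} bits")
```

With a choice probability of .5 the response carries one full bit; at .8 it carries roughly three-quarters of a bit, so a fixed amount of transmitted information leaves a smaller percentage unexplained.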


Where does all that unexplained 
information (which corresponds more or 
less to unexplained variance in a com- 
ponents-of-variance analysis) come from? 
Not from previous responses and their 
rewards or nonrewards. By pooling over 
Ss, it was possible to perform an analysis 
which included the previous two re- 
sponses and their rewards or nonrewards. 
Addition of the extra previous response 
and reward produced no significant de- 
crease in unexplained information. A 
theory about the source of unexplained 
response information will be presented 
in the discussion section. 


Experiment II 


Purpose.—This experiment was in- 
tended to examine the effect of 
amount of reward and its interaction 
with probability of reward. Specifi- 
cally, it was designed to find out to 
what extent an increase of amount of 
reward can compensate for a decrease 
in probability of reward, and whether 
the effects of such changes can be 
predicted from the obvious and sim- 
plest assumption, which is simply that 
the expected amounts of reward 
should be critical in determining 
choices. 


Program.—There were eight days and three
groups (OL, OD, and IL) in this experiment. 
Group ID was omitted because all results of 
Exp. I, plus a line of reasoning to be presented 
later, indicated that it was essentially identical 
with Group OD. As in Exp. I, Day 1 of this 
experiment was a check for position preferences, 
and none were found. The experimental ar- 
rangements for Days 2 through 8 are presented 
in Table 1. Each member of each experimental 
group received the treatment appropriate to his 
group on Days 2 through 8. The required ante 
of chips was two chips rather than one, but the 
value of each chip was only 1 cent, instead of 5
cents. This experiment used no control days
because its main point concerns the difference 
made by varying amount of reward in otherwise 
identical situations, and this difference is most
clearly visible when two adjacent days differ only
in amount of reward. The rate of attaining the
asymptotic probability of choice was probably
affected by this procedure, but it is believed that
the value of that asymptote and the sequential
dependencies in the data would have been the
same if experimental days had been separated
by control days or if some form of random or
orthogonal order of presenting conditions had
been used. Once again, this assumption is a
much-watered-down version of the path independence
assumption of stochastic learning
theory.

TABLE 1

Experimental Design for Exp. II

[Table not recoverable in full. Column headings: P: Win on Left; Payoff on Left;
P: Win on Right; Payoff on Right. The surviving cell entries are payoffs of
12 chips and 4 chips.]


Results.—Figure 3 shows the proba- 
bility of pressing the left button by 
25-trial blocks for all days except 
Day 1 of Exp. II. The three groups
are not combined. The reason is that
analysis of variance showed the inter- 
action between groups and amounts 
of reward to be significant. (Inci- 
dentally, that analysis also showed 
that the interaction of probabilities 
and amounts of reward was signifi- 
cant, and that both probabilities and 
amounts of reward were significant 
relative to that interaction.) Ex- 
amination of Fig. 3 makes a number 
of things clear. First, both proba- 
bility and amount of reward are 
important in determining choice. It 
is possible to compensate for changes 
in probability of reward by reverse 
changes in amount of reward. A 
rough generalization, comparable in 
accuracy to the probability matching 
generalization and with far more 
predictive power, is that the asymp- 
totic probability of choice of an 
alternative will be approximately 
equal to the ratio of the expected 
value of that alternative to the sum 
of the expected values of the two 
alternatives. This will hereafter be
called the EV-matching generalization.
Of course in the special case in which
the sizes of reward are the same for
the two alternatives and the probabilities
of reward add up to one, this
generalization reduces to the probability
matching generalization.

Fig. 3. Probability of choosing the left button for each group of Exp. II. Note that each 25-trial
block is named by its upper end point. As in Fig. 1, the solid horizontal line for each day is the
probability of being rewarded for choosing the left button on that day. The dashed horizontal line
is the ratio of the expected value of the left button to the sum of the expected values of both buttons.
On equal-reward days, this line coincides with the solid horizontal line. [Figure not reproduced.
Ordinate: probability of pushing the left button. Abscissa: 25-response blocks of trials, one panel
per day. Separate curves for Groups IL, OL, and OD.]
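A small sketch of the EV-matching generalization follows. It assumes that expected value is computed from the gross payoff in chips, not net of the ante; this is the reading under which the rule reduces to probability matching when payoffs are equal and the probabilities sum to one, as stated above. The function name `ev_matching` is hypothetical. The unequal-payoff example uses the arrangement illustrated in Table 2 (probabilities .7 and .3, payoffs of 4 and 12 chips).

```python
def ev_matching(p_left, payoff_left, p_right, payoff_right):
    """Predicted asymptotic probability of choosing the left button.

    Assumption: expected value is probability times gross payoff in chips.
    """
    ev_left = p_left * payoff_left
    ev_right = p_right * payoff_right
    return ev_left / (ev_left + ev_right)


# Equal payoffs, probabilities summing to one: the prediction is simply
# the probability of reward on the left (probability matching).
equal_reward = ev_matching(0.7, 4, 0.3, 4)

# Unequal payoffs: the larger reward on the right more than offsets its
# lower probability, so the predicted left-choice probability falls below .5.
unequal_reward = ev_matching(0.7, 4, 0.3, 12)

print(f"equal payoffs:   {equal_reward:.4f}")
print(f"unequal payoffs: {unequal_reward:.4f}")
```

The second case shows how an increase in amount of reward can compensate for a decrease in probability of reward under this rule.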

On Days 5 and 7 of this experiment, 
as in Exp. I, the actually observed 
asymptotic probabilities of choice are 
more extreme than the probability 
matching generalization or the EV- 
matching generalization would 
predict. 

Multivariate information transmission
analysis was applied to the
data of Exp. II. The results were
similar in almost all respects to those
of Exp. I. The previous response, its
reward or nonreward, and Ss trans- 
mitted significant amounts of in- 
formation to the present response, 
and there was some reason to believe 
that the interaction of previous re- 
sponse with its reward or nonreward 
would have been significant, if it 
could have been tested for signifi- 
cance. About 75% of the total 
response information was not ac- 
counted for by these variables. No 
substantial group differences in 
amount of explained information were 
found in this experiment, in contrast 
with the results for Exp. I.


Discussion 


The RELM rule.—The asymptotic
probabilities of choice found in both
experiments can be accounted for, in a
general way, by means of a concept
borrowed from statistical decision function
theory. L. J. Savage, who probably
invented the concept,³ calls it loss. The
loss involved in a risky decision is
defined as the difference between the
amount of payoff actually obtained and
the amount which could have been obtained
if the decider knew the true state
of the world (in this case, how much
payoff would result from pushing each
button) before making his decision.
Thus, if the left button will pay off 4
chips this trial and the right button will
pay off 12, then the loss associated with
choice of the left button is 8 and the loss
associated with choice of the right button
is 0. It is possible to calculate the expected
loss for each experimental arrangement
used in Exp. I and II. Such
a calculation is illustrated in Table 2.

Instead of using this concept in the
all-or-nothing manner of statistical decision
theory, it seems appropriate to
make two modifications in it. The first
introduces relative rather than absolute
amounts of loss; the second produces
probabilistic rather than all-or-none predictions.
The decision rule to be tested
against the data, then, goes like this:
If EL stands for expected loss, then the
greater the value of $\Delta EL / \overline{EL}$, the more likely
will S be to choose the alternative with
the smaller EL. For the case in which
there are two alternatives, A and B,
this means that the greater the value of

$$\frac{EL_B - EL_A}{(EL_A + EL_B)/2},$$

the more likely should S be to choose A.
If the total quantity
is positive, he should choose A more often
than B; if negative, he should choose B
more often than A. Hereafter this
decision rule will be called the Relative
Expected Loss Minimization rule, or
RELM rule.

³ The first published mention of this concept
appears in a book review by Savage (19). He
there attributes the idea to Wald, although he
acknowledges that Wald never stated it explicitly.
In a later publication, Savage says
that Wald “took the position that loss . . .
[was an] invention of mine toward which he was
tentatively sympathetic” (20, p. 170). Savage
uses the notion of loss in connection with the
minimax criterion; Wald always stated the
minimax criterion as applying to negative income;
but Wald’s examples are all of a type in
which the two concepts are mathematically
identical. Matters are further complicated by
the fact that all writers in this field other than
Wald and Savage reserve the term loss for the
concept of negative income, and call Savage's
concept regret. Since Savage seems to have
originated the concept, his language is followed
here.

TABLE 2

Illustrative Calculation of Expected Loss for Group IL, Day 4, Exp. II

Payoff Matrix
                  State of the World
Choice      L+R+    L+R-    L-R+    L-R-
Left         +2      +2      -2      -2
Right       +10      -2     +10      -2

Loss Matrix
                  State of the World
Choice      L+R+    L+R-    L-R+    L-R-
Left          8       0      12       0
Right         0       4       0       0

The probability of being rewarded on L is .7; on R is .3; the two buttons are independent.
Therefore
Expected loss for left choice:  8(.7)(.3) + 0(.7)(.7) + 12(.3)(.3) + 0(.3)(.7) = 2.76
Expected loss for right choice: 0(.7)(.3) + 4(.7)(.7) + 0(.3)(.3) + 0(.3)(.7) = 1.96

Note.—If the opposed payoff condition were used, the states of the world represented by the first and fourth
columns of the payoff and loss matrices would have a probability of 0 (i.e., would never occur). The second and
third columns would have probabilities .7 and .3; and the expected losses would be 3.6 for the left choice and 2.8
for the right choice. The difference between the two alternatives is the same for the two payoff conditions, but
their sum, which enters into the $\Delta EL / \overline{EL}$ calculation presented in the text, is different.

The RELM rule predicts a complete
ordering of probabilities of choice for all
experimental arrangements which differ
from one another in amount or probability
of reward, or in whether the reward
schedules for the two buttons are opposed
or independent. This prediction extends
across both experiments. (The RELM
rule also predicts which alternative
should be more frequently chosen; all
such predictions were correct.) There
are a total of 24 different experimental
arrangements in these two experiments
taken together, ignoring Days 1 of both
experiments and Days 4, 6, and 8 of Exp.
I. The product-moment correlation between
relative expected loss and probability
of choosing the left button on the
last 100 trials of each day is .96. To put
it another way, the RELM rule predicts
a total of 267 inequalities in probability
of choice (the other nine predictions are
for ties), and 252, or 94%, of these
predictions are correct.

Although the logic of the RELM rule
permits only ordinal predictions, the
correlation between relative expected
loss and probability of choice is so high
that it is interesting to examine the prediction
equation. The scatter plot
makes it clear that the relation is linear.
Since the RELM rule requires that
probability of choice of each alternative
be .5 whenever the difference in expected
loss between the two alternatives is zero,
the equation must be of the form:

$$p = .5 + K \, \frac{\Delta EL}{\overline{EL}} \qquad (1)$$

where p is the probability of choosing the
left button and K is a constant. A one-parameter
least-squares fit shows that
K = .2650.

Why does the RELM rule work? It
embodies three tendencies which were
found in the data; its merit over a verbal
statement of each tendency separately
is that it specifies quantitative relationships
among them. One of these tendencies
is that high probabilities of reward
are preferred to low. The second is that
large amounts of reward are preferred to
small. The third is that for IL groups
the difference between the two buttons
is much more pronounced, if it exists at
all, than for the other groups. In a
simple case, this point is obvious. Consider
Day 5 of Exp. II in which the
probability of reward was .7 on the left
button and .3 on the right. For Group
IL, on 42 trials out of 100 it didn’t
matter which button was pressed, since
the buttons were either both right or
both wrong. On the remaining 58 trials
out of 100 on which the choice did make
a difference, the left button paid off 49
times, or 84.5% of the time. So the
effective probability differential in favor
of the left button was substantially better
than the .7 probability of reward would
lead you to believe. The same kind of
point, but with more complicated numbers,
can be made for the different-size-of-reward
situations of Exp. II. The
RELM rule embodies this mathematical
property of the independent reward arrangement
for all possible combinations
of probability and amount of reward.
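The arithmetic behind the RELM rule can be sketched in a few lines. The code below recomputes the expected losses of the Table 2 arrangement (probabilities .7 and .3, payoffs of 4 and 12 chips) for both the independent and the opposed condition, applies the linear prediction equation with the reported K = .2650, and reproduces the 84.5% figure for equal payoffs. The function names are hypothetical, and using gross payoffs is an assumption; because the ante is the same for both buttons it cancels out of every loss.

```python
from itertools import product

K = 0.2650  # least-squares constant reported in the text


def state_probabilities(p_left, p_right, opposed):
    """Map (left_pays, right_pays) to the probability of that state."""
    if opposed:
        # Exactly one button pays off; p_left + p_right is assumed to be 1.
        return {(True, False): p_left, (False, True): p_right}
    return {
        (left, right): (p_left if left else 1 - p_left)
        * (p_right if right else 1 - p_right)
        for left, right in product((True, False), repeat=2)
    }


def expected_losses(p_left, pay_left, p_right, pay_right, opposed):
    """Expected loss of choosing left and of choosing right."""
    el_left = el_right = 0.0
    states = state_probabilities(p_left, p_right, opposed)
    for (left, right), prob in states.items():
        got_left = pay_left if left else 0
        got_right = pay_right if right else 0
        best = max(got_left, got_right)  # payoff with perfect foresight
        el_left += prob * (best - got_left)
        el_right += prob * (best - got_right)
    return el_left, el_right


def relm_prediction(el_left, el_right, k=K):
    """Predicted probability of choosing left from relative expected loss.

    Undefined when both expected losses are zero.
    """
    relative = (el_right - el_left) / ((el_left + el_right) / 2)
    return 0.5 + k * relative


independent = expected_losses(0.7, 4, 0.3, 12, opposed=False)
opposed = expected_losses(0.7, 4, 0.3, 12, opposed=True)
p_independent = relm_prediction(*independent)
p_opposed = relm_prediction(*opposed)

# Equal payoffs, independent buttons: share of "choice matters" trials
# on which the left button is the one that pays off.
probs = state_probabilities(0.7, 0.3, opposed=False)
matters = probs[(True, False)] + probs[(False, True)]
effective_left = probs[(True, False)] / matters

print(independent, opposed)
print(round(p_independent, 4), round(p_opposed, 4), round(effective_left, 4))
```

The difference in expected loss is the same in both conditions, but the smaller mean expected loss under independence makes the relative difference larger, so the predicted choice probability is further from .5 for the independent condition.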

The same argument ought to apply to 
Group ID of Exp. I. The reason why it
does not is that no S in Group ID can
ever know that on some trials his choice
is irrelevant. If he chooses the left button,
it is irrelevant for him whether only
the left button or either button would
have paid off. That is why Group IL is
often unique in Fig. 2 and 3. Because 
of this, the data from Groups OL, OD, 
and ID were averaged and used as the 
opposed-condition data in comparing 
observed orderings with those predicted 
by the RELM rule. 

Group OL has more information than 
Groups OD and ID; why isn’t it different 
from them? Both data and retro- 
spective reports show that unless Ss have 


positive information to the contrary they
assume that whenever one alternative is
wrong, the other must have been correct.
So the extra information available to Ss 
in Group OL only confirmed what they 
took for granted anyhow. 

The extreme-asymptote generalization.— 
In any experiment involving a sub- 
stantial number of different conditions, 
the RELM rule is a much more satis- 
factory rule of thumb for predicting 
asymptotic probabilities of choice than 
the EV-matching generalization. How- 
ever, unless the relation between relative 
expected loss and asymptotic probability 
of choice is exactly known, the RELM 
rule predicts only rank orderings. 

Both experiments in this paper indi- 
cate that the EV-matching generali- 
zation is not very satisfactory. They 
suggest a better generalization for the 
case of equal amounts of reward and 
probabilities of reward which add up to 
one: the asymptotic probability of 
choice is more extreme than the proba- 
bility of reward, and as the difference 
between the probability of reward and 
.5 increases, the difference between the
asymptotic probability of choice and the 
probability of reward also increases 
until the asymptotic probability of 
choice becomes 1 or 0, and so can change
no more. From here on, this principle
will be called the extreme-asymptote
generalization. A stochastic learning
model which predicts it has been developed (5).

TABLE 3

Most Extreme, Final, and Estimated Asymptotic Probabilities of Choice Found in Previous
Two-Alternative Experiments for Which the Probability Matching Generalization
and the Extreme-Asymptote Generalization Give Different Predictions

[Table entries not recoverable. Surviving column headings: Experimenter; Final; Estimated.
Experiments listed: Grant, Hake, & Hornseth (13); Jarvik (15); Hake & Hyman (14);
Burke, Estes, & Hellyer (1); Estes & Straughan (8); Goodnow (11).]

Which of these conflicting rules of 
thumb is more nearly correct? To find 
out, estimates of asymptotic probabilities 
of choice are needed. Table 3 contains 
the most extreme probability of choice 
(an estimate of the asymptotic proba- 
bility of choice which is biased in favor 
of the extreme-asymptote generalization),
the last probability of choice
(biased in the other direction), and an 
estimate of the asymptote (subjective, 
but as unbiased as I can make it), for 
all published relevant experiments that 
I know. It seems clear that almost 
all available data support the extreme- 
asymptote generalization rather than the 
EV-matching generalization. It may be 
noted, however, that most of the num- 
bers in Table 3 are based on readings 
from graphs, and so are subject to small 
reading errors which might be biased. 

The relation between the extreme- 
asymptote generalization and the RELM 
rule is very simple. If Equation 1 gives 
the correct relation between relative 
expected loss and asymptotic probability 
of choice, then the extreme-asymptote 
generalization for opposed conditions 
(with amount of reward the same on 
both sides and probabilities of reward 
which add up to one) can be stated as 
follows: K in Equation 1 must be 
greater than 1/4. No similarly simple
statement for independent conditions
exists, but the RELM rule always
predicts a more extreme asymptote for
independent than for comparable opposed
conditions, so if K is greater than
1/4, the extreme-asymptote generalization
will be true for independent conditions 
also. The value of K found in the 
experiments reported here meets this 
requirement, even though it is based on 
the last 100 trials of each day, rather than 
on some estimate of the asymptotes. 
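The threshold on K follows from a short derivation, sketched here under the stated restrictions (opposed conditions, equal rewards, probabilities of reward summing to one). The symbols π for the probability of reward on the left button and w for the common amount of reward are introduced only for this sketch; they do not appear in the original.

```latex
\begin{align*}
EL_L &= (1-\pi)\,w, \qquad EL_R = \pi\,w, \\
\Delta EL &= EL_R - EL_L = (2\pi - 1)\,w, \qquad
\overline{EL} = \tfrac{1}{2}\,(EL_L + EL_R) = \tfrac{1}{2}\,w, \\
p &= .5 + K\,\frac{\Delta EL}{\overline{EL}}
   = .5 + 4K\,(\pi - .5), \\
p - \pi &= (4K - 1)\,(\pi - .5).
\end{align*}
```

The last line shows that p lies further from .5 than π does exactly when 4K exceeds 1, and that the gap between p and π grows in proportion to the distance of π from .5, which is the pattern the extreme-asymptote generalization describes. With K = .2650 the slope 4K is 1.06.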

Ratoosh (personal communication) 
has performed unpublished experiments 
on both rats and human beings in which 
he found that for probabilities of reward
greater than about .7, the asymptotic
probability of choice is 1, and D. Meyer 
(personal communication) has found the 
same thing for monkeys. It is believed 
that these data, taken in conjunction 
with those reported here and with many 
other data, suggest that a principle 
somewhat like the Yerkes-Dodson Law 
operates in probability learning. If the 
reward involved is trivial, S will not
care very much whether he is right or 
wrong, the value of K in the RELM rule 
will be small, and asymptotic proba- 
bilities of choice will be relatively far 
from 0 and 1, perhaps as far as the 
probability matching generalization pre- 
dicts. As the amount of reward (or the 
intensity with which the reward is 
desired) increases, the value of K will 
increase, and asymptotic probabilities of 
choice will be relatively near 0 and 1.
If the amount of reward or strength of
motivation for it is extreme, then S will
be distressed whenever a consistent
strategy produces an unrewarded choice,
the relatively “rational” behavior found
in the middle range of motivation will be
disrupted, and asymptotic probabilities
of choice will again be relatively far from
0 and 1, though never as far as when S
doesn’t really care. This theory is easy 
to test. It is, of course, entirely con- 
sistent with the RELM rule. This 
suggestion is related to one made by 
Grant, Hake, and Hornseth (13). 


Path independence.—The finding of no 
significant sequential dependencies far- 
ther back than the previous response 
supports the assumption of path inde- 
pendence made by stochastic learning 


theories. That assumption is that the 
parameters of the model plus the present 
probabilities of response are enough to 
predict future behavior as accurately as 
it can be predicted. 

This experiment was designed around 
a more general kind of path independence 
assumption, which says that asymptotic 
probabilities of choice and sequential 
dependencies in the last 100 responses of 
a given day were independent of events 
on previous days (except for familiarization
effects which were assumed to be
complete in Day 1, data from which
were discarded). All relevant com- 
parisons (e.g., between Days 5 and 7 of 
the two experiments) support this as- 
sumption, and further indicate that the 
only effect of a previous day is on the 
initial probabilities of choice for the 
subsequent day, exactly as the stochastic
learning models would predict.

A theory about individual choices.—
The extreme-asymptote generalization 
and the RELM rule are interesting and 
helpful, but they constitute far from a 
complete account of what goes on in 
experiments like this. The most im- 
portant fact about such experiments is 
that Ss change their patterns of choice 
in a way which looks more or less gradual 
when you look at averaged data, but 
which often looks sudden and discon- 
tinuous when you look at individual 
records. These changes are lawfully 
related to probability and amount of 
reward. A convincing account of how 
they take place is lacking. The sto- 
chastic learning theories present ac- 
counts of this sort, but they do not 
correspond very closely to the behavior 
of individual Ss. They imply a regu- 
larity and smoothness of change which 
is not usually observed in individual 
records. This question about the degree 
to which average curves misrepresent 
individual data has been insufficiently 
discussed in the literature; it is possible 
to extract from some papers the incorrect 
impression that in general there is 
relatively little individual variation in 
probability learning situations. In both 
of the experiments reported in this paper 
the opposite was true; individual differ- 
ences were large. The same is true of the 
Detambel (3), Flood (9), Goodnow (11), 
Goodnow and Postman (12), Grant, 
Hake, and Hornseth (13), and Hake and 
Hyman (14) experiments, and of un- 
published experiments by R. R. Bush 
and his students, H. W. Hake, R. 
Hyman, and P. Ratoosh. These state- 
ments are based on personal communi- 
cations on this subject from Bush, 
Detambel, Flood, Goodnow, Hake, 
Hyman, and Ratoosh. Bush adds that 
one of the main difficulties with the
Bush-Mosteller learning model is that
the individual differences found in proba- 
bility learning situations seem to be so 
much larger than the model predicts. 

Observation of Ss, examination of 
individual records, and retrospective
reports by Ss all suggest that Ss make 
their choices on the basis of hypotheses 
about what the reward sequence is. 
Two kinds of such hypotheses can be 
distinguished. The first is what might 
be called big hypotheses: e.g., “The left 
button pays off far more often than the 
right one.” The main use of the big 
hypotheses is to rule out certain classes 
of small hypotheses and to make other 
classes of small hypotheses more proba- 
ble. For example, a small hypothesis 
consistent with the big hypothesis used 
as an example above would be “The 
right button never pays off oftener than 
once every three times or less often than 
once every five times; so I should press 
the left button three times in succession 
and, if I win all three times, shift to the 
right button until it pays off once, and 
then shift back to the left button.” 
Almost every decision is made on the 
basis of a small hypothesis. Since no 
small hypothesis can possibly be correct, 
Ss change their small hypotheses at 
frequent intervals, as the evidence 
against each one accumulates. 

The purpose of the information analy- 
ses presented in this paper, of course, 
was to test for the occurrence of se- 
quential hypotheses such as those men- 
tioned above. The fact that only about 
25% of response information is accounted 
for by individual differences, previous 
responses, and their rewards or non- 
rewards seems at first sight to be evi- 
dence against this picture of the decision 
process. From the point of view of 
information analysis such unexplained 
information must be dismissed as
“noise.” Its real origin, however, probably
is not in random behavior, but rather
in the fact that the information analysis
must be performed on probabilities
estimated from a large sample of re- 
sponses—in this case, 100 per day. If 
Ss change their small hypotheses many 
times in the course of 100 responses, then 













only common features of these different 
hypotheses (namely, those features 
which I have called big hypotheses) 
will emerge in the information analysis; 
all the specific properties of small 
hypotheses will contribute only to 
“noise.” 
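
As a rough illustration of the kind of information analysis discussed here, the following Python sketch estimates, from relative frequencies, how much of the uncertainty in a two-alternative response sequence is accounted for by the immediately preceding response. The function names and the toy sequence are assumptions introduced for illustration; they are not the procedure or the data of these experiments, whose analyses also took individual differences and rewards into account.

```python
import math
from collections import Counter

def entropy(counts):
    """Shannon uncertainty (bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def proportion_accounted_for(responses):
    """Share of response uncertainty removed by knowing the previous response.

    Computes T = H(R) - H(R | previous R) from relative frequencies and
    returns T / H(R). Hypothetical helper, not the paper's procedure.
    """
    pairs = list(zip(responses[:-1], responses[1:]))
    current = Counter(cur for _, cur in pairs)
    h_r = entropy(current.values())
    if h_r == 0:
        return 0.0
    # H(R | prev) = sum over prev values of p(prev) * H(R | prev = value)
    h_cond = 0.0
    for prev_value, n_prev in Counter(prev for prev, _ in pairs).items():
        following = Counter(cur for prev, cur in pairs if prev == prev_value)
        h_cond += (n_prev / len(pairs)) * entropy(following.values())
    return (h_r - h_cond) / h_r

# Toy data: strict alternation is fully predictable from the previous response.
alternating = ["L", "R"] * 50
print(proportion_accounted_for(alternating))
```

A sequence generated by one fixed rule gives a proportion near 1; a sequence pieced together from many short-lived rules would be expected to give a much smaller proportion, since each rule contributes only a few responses to the estimated probabilities.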

This theory is plausible, but testing it 
is very difficult. Almost never can S 
say after making a choice what the 
hypothesis was which led him to make 
it. Until someone develops a technique 
for finding out what these small hy- 
potheses are, or at least for finding out 
when one small hypothesis is discarded 
and another one takes its place, arguments
like this one must remain
speculative.

If trial and error may be considered a
form of sequential assimilation of nonsequentially
presented information, then
the same theory should apply to problem-solving
and concept-formation experiments.
Such experiments differ
from those presented here primarily
in that certain small hypotheses can be
correct. It seems likely that this theory
might be more easily tested in an
experiment in which a correct small
hypothesis exists.


SUMMARY 


This paper reports two experiments in which 
probability of reward, amount of reward, and 
the nature of the information available to Ss 
about the reward pattern were systematically 
varied in a two-alternative situation in which Ss 
operated a slot machine for real money. It was 
found that both probability and amount of 
reward strongly influenced choices, and in- 
formation about reward pattern influenced 
choices to some extent. Three generalizations 
emerged from the data: 


1. Other things being equal, the more fre- 
quently rewarded an alternative is the more 
likely S is to choose it. If the reward on the 
less frequently rewarded side is larger, the effect 
of probability is largely overcome. Exact pre- 
dictions about the rank ordering of all possible 
combinations of effects of this sort (and about 
the effects of information about reward pattern) 
can be made by means of a decision rule remotely 
derived from the statistical decision function 
concept of loss. The correlation between these 
predictions and the data is .96. 




2. The asymptotic probability of choice is 
more extreme than the probability of reward, 
and as the difference between the probability of 
reward and .5 increases, the difference between 
the asymptotic probability of choice and the 
probability of reward also increases until the 
asymptotic probability of choice becomes 1 or 0
and so can change no more. (This generaliza- 
tion only applies if the rewards are of the same 
size for the two alternatives, and if the proba- 
bilities of being rewarded for the two alternatives 
add up to one.) 

3. The Ss develop two kinds of hypotheses 
about reward sequences. Big hypotheses define 
classes of small hypotheses. Small hypotheses, 
which usually specify sequences of choices, 
cannot be correct, and so are constantly being 
tried, abandoned, and replaced with other small 
hypotheses. 


REFERENCES 


1. Burke, C. J., Estes, W. K., & Hellyer, S.
Rate of verbal conditioning in relation to
stimulus variability. J. exp. Psychol.,
1954, 48, 153-161.

2. Bush, R. R., & Mosteller, F. Stochastic
models for learning. New York: Wiley,
1955.

3. Detambel, M. H. A reanalysis of Humphreys’
“Acquisition and extinction of
verbal expectations.” Unpublished
master's thesis, Indiana Univer., 1950.

4. Edwards, W. The theory of decision
making. Psychol. Bull., 1954, 51, 380-417.

5. Edwards, W. A mathematical model for
two-alternative learning which admits
sequential dependencies. Unpublished
draft. (Laboratory Note 54-2, Armament
Systems Personnel Research Laboratory,
Air Force Personnel and Training
Research Center, Lowry Air Force Base,
Denver.)

6. Estes, W. K. Toward a statistical theory
of learning. Psychol. Rev., 1950, 57,
94-107.

7. Estes, W. K., & Burke, C. J. A theory of
stimulus variability in learning. Psychol.
Rev., 1953, 60, 276-286.

8. Estes, W. K., & Straughan, J. H. Analysis
of a verbal conditioning situation in
terms of statistical learning theory. J.
exp. Psychol., 1954, 47, 225-234.

9. Flood, M. M. Environmental non-stationarity
in a sequential decision-making
experiment. In R. M. Thrall, C. H.
Coombs, & R. L. Davis (Eds.), Decision
processes. New York: Wiley, 1954.













10. Garner, W. R., & McGill, W. J. Relation
between uncertainty, variance, and correlation
analyses. Psychometrika, in
press.

11. Goodnow, J. J. Determinants of choice-distribution
in two-choice situations.
Amer. J. Psychol., 1955, 68, 106-116.

12. Goodnow, J. J., & Postman, L. Learning
in a two-choice probability situation with
a problem-solving setting. J. exp. Psychol.,
1955, 49, 16-22.

13. Grant, D. A., Hake, H. W., & Hornseth,
J. P. Acquisition and extinction of
verbal expectations in a situation analogous
to conditioning. J. exp. Psychol.,
1951, 42, 1-5.

14. Hake, H. W., & Hyman, R. Perception of
the statistical structure of a random
series of binary symbols. J. exp. Psychol.,
1953, 45, 64-74.

15. Jarvik, M. E. Probability learning and
a negative recency effect in the serial
anticipation of alternative symbols. J.
exp. Psychol., 1951, 41, 291-297.

16. McGill, W. J. Multivariate transmission
of information and its relation to analysis
of variance. Hum. Factors Operations
Res. Lab. Rep., Air Res. & Develpm. Command,
MIT, 1952. (Rep. No. 32.)

17. McGill, W. J. Multivariate information
transmission. Psychometrika, 1954, 19,
97-116.

18. Miller, G. A., & Madow, W. G. On the
maximum likelihood estimate of the
Shannon-Wiener measure of information.
Operational Applications Lab. Rep., Bolling
AFB, Cambridge Res. Cent., Air Res.
& Develpm. Command, 1954. (Rep. No.
AFCRC-TR-54-75.)

19. Savage, L. J. The theory of statistical
decision. J. Amer. statist. Ass., 1951, 46,
55-67.

20. Savage, L. J. The foundations of statistics.
New York: Wiley, 1954.

(Received August 11, 1955) 











Journal of Experimental Psychology
Vol. 52, No. 3, 1956


PERFORMANCE UNDER OPTIMAL PRACTICE CONDITIONS 
FOLLOWING THREE DEGREES OF MASSING 
OF EARLY PRACTICE 


JOHN M. DIGMAN 


University of Hawaii 


A considerable number of studies 
in the area of motor skills have been 
concerned with the concepts of reactive
inhibition (I_R) and conditioned
inhibition (sI_R). While the theoretical
papers of Ammons (2) or
Kimble (6) ought to be consulted for
explicit statements concerning the
presumed functions of these constructs,
the following points have
been more or less assumed. (a)
Habit growth of the desired response
proceeds at the same rate under either
massed or distributed practice. (b)
Under massed practice, performance
is lowered as a result of the gradual
accumulation of I_R, which dissipates
spontaneously with rest. (c) With
considerable massing of practice, and
accompanying I_R, a permanent decrement
in performance will result. This
permanent decrement is a function of
the presence of sI_R, a construct somewhat
akin to the construct of habit,
in the sense that its effect must be
taken into account in making predictions
of future performance. Although
the evidence for I_R is a laboratory
commonplace, a review by Ellis
(5) finds the evidence for sI_R “fragmentary.”
Contributing to this
judgment are the studies of Adams 
(1), Reynolds and Bilodeau (8), and 
Archer (4). 

A chief difficulty encountered in 
these and other studies in this area 
has been that the constructs used have 
often appeared clearer logically than 
they have in practice. This is especially
true of sI_R, a demonstration of










which might logically be made by a 
comparison, on the first postrest trial, 
of the performance of a previously 
massed practice group with the per- 
formance of a previously distributed 
practice group. This, however, is 
complicated by the phenomenon of 
“warm-up.” Granted the fact of 
warm-up, the demonstration of sI_R
might best be made by comparing the 
long run postrest performance of a 
number of groups which differ in 
respect to the degree of prerest 
massing of practice. One attack on 
this was made by Ammons (3), whose 
postrest groups were all under massed 
practice conditions. The results of 
the study were somewhat ambiguous, 
and in attempting to account for them 
the author introduced another possible 
explanatory principle: the possibility
of stimulus generalization from
one kind of prerest practice condition 
to another kind of practice condition 
following rest. 

The present study was concerned 
with an attempt to discover the effects 
of three degrees of massing of practice 
of early trials on long run postrest 
performance, when the postrest per- 
formance is under relatively ideal 
practice conditions, ideal in the sense 
that trials are well distributed and 
knowledge of results is quickly given 
at the conclusion of each trial. 


METHOD

Subjects.—The Ss were 41 female students


from elementary psychology classes at the 
University of Hawaii. All Ss volunteered for 







the experiment. One S was dropped from the 
experiment when she gave evidence of having 
participated in a similar experiment in the past. 
The Ss were assigned to the various experimental 
conditions in order of their appearance. 

Experimental conditions.—Depending on the 
relative massing of practice in the prerest trials, 
there were three groups of Ss, characterized as 
follows: Group M1—Ss in this group received
16 trials of 20 sec. each, separated by 10-sec.
rests; Group M2—Ss received 5 min. and 20 sec.
of practice with no rest; Group D—Ss received
16 trials of 20 sec. each, with a 5-min. rest after
Trials 4, 8, 12, and 16, and 10-sec. rests after all
other trials. In addition, all groups received
four initial trials of 20 sec. each, separated by
10-sec. rests. This was done in an effort to
obtain some index of differential initial ability.
After Trial 20, there was a 5-min. rest, and then
all groups were given 30 additional trials of 20
sec. each, with 10-sec. rests following odd numbered
trials and 2-min. rests following even
numbered trials. One week intervened between 
Trials 30 and 31. 
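
The prerest schedules just described can be checked with a small sketch. The Python below encodes Trials 5-20 for each group as (work, rest) pairs in seconds and shows that the continuous group's 5 min. and 20 sec. equals the work time of sixteen 20-sec. trials. The names `prerest_schedule`, `TRIAL_SEC`, and the dictionary-style group labels are assumptions introduced for illustration, as is the decision to list the common 5-min. rest after Trial 20 as the rest following each group's last prerest work period.

```python
TRIAL_SEC = 20          # length of one work period, from the text
SHORT_REST_SEC = 10     # rest between massed trials
LONG_REST_SEC = 5 * 60  # 5-min. rest

def prerest_schedule(group):
    """Return (work_sec, rest_sec) pairs for Trials 5-20 of one group.

    The rest listed with a trial is the rest that follows it; the rest
    after Trial 20 is the 5-min. rest common to all groups.
    """
    if group == "M2":
        # Continuous practice: one unbroken work period, no internal rests.
        return [(16 * TRIAL_SEC, LONG_REST_SEC)]
    schedule = []
    for trial in range(5, 21):
        if trial == 20 or (group == "D" and trial % 4 == 0):
            rest = LONG_REST_SEC
        else:
            rest = SHORT_REST_SEC
        schedule.append((TRIAL_SEC, rest))
    return schedule

for group in ("M1", "M2", "D"):
    pairs = prerest_schedule(group)
    work = sum(w for w, _ in pairs)
    rest_within = sum(r for _, r in pairs[:-1])  # exclude the common final rest
    print(group, "work:", work, "sec.; rest within block:", rest_within, "sec.")
```

All three groups therefore have the same amount of work in this block and differ only in how much rest is interspersed. Group D's additional 5-min. rest after Trial 4 falls before the block and is not counted here.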

Apparatus.—A pursuit rotor manufactured 
by the Ralph Gerbrands Co. was used. In this 
model the target button is 12.5 mm. in diameter, 
and is imbedded in a Bakelite disc 25.5 cm. in
diameter. The target button revolved in an
epicyclic pattern at the rate of 30 rpm. A
hinged brass stylus, 25 cm. in length, hinged 12
cm. from the target end, and with a tip 5.5 mm.
in diameter, completed a circuit with a scoring
clock whenever the tip was in contact with the
target button. Target to floor distance was 35
in. An interval timer manufactured by the
Ralph Gerbrands Co. was used to obtain the
desired work-rest cycles of 20-sec. work, 10-sec.
rest. Longer rests were timed with a stop
watch.

Procedure.—The Ss were carefully instructed
as to the proper stance before the rotor and the 
proper method of holding the stylus. They 
were told that the experiment was concerned 
with learning, that they would undoubtedly 
improve their scores with practice, and that 
their scores would be told to them at the end of 
each 20 sec. of practice. In the case of the 
continuous practice group, this latter feature of 
the experiment was accomplished by switching 
the interval timer over to the clock motor circuit 
and reading the clock during the 10-sec. “off” 
phase. (The M2 group, therefore, received less
knowledge of results than did the other two 
groups.) The Ss were told that a score of 20 
indicated a perfect score, were shown the clock 
circuit, and had the general method of scoring 
explained to them. The E indicated his role in
the experiment and further told S that he would
warn her with “Ready” about 2 sec. before the
beginning of each performance period. The Ss
in Group D were asked to sit down during the
5-min. rest periods. Some ancient magazines
were at hand, and discussion was kept to a
minimum. In general, morale seemed high, and
interest in the experiment was keen.

Fig. 1. Performance on pre- and postrest
trials as a function of degree of distribution of
early trials. (Ordinate: time on target, in sec.;
abscissa: 20-sec. trials.)


RESULTS 


Prerest performance.—The results 
for the first 20 trials are rather much 
as one would anticipate (see Fig. 1). 
Group D, the distributed practice 
group, displays, in general, a nega- 
tively accelerated curve of perform- 
ance, with appreciable reminiscence 
after each rest. Group M1 shows
rapid improvement up to Trial 3. 
Past this point, improvement is very 
gradual, with performance on Trial 
20 considerably below the level of 
Group D. Group M2 shows improvement
up to Trial 5, thereafter a
steady decline in performance to 
Trial 20. At the end of the first 20 
trials, the performance of Group M2
is below its performance on Trial 2, 
and only slightly above its perform- 
ance on Trial 1. 

Postrest performance.—On the basis 
of the within-groups regression co- 
efficients, based on the matching 
variable of sum of Trials 1-4 and each 
postrest trial, the means for all trials 
beyond Trial 20 were adjusted, giving 
predictions of what performance on 



















these trials would have been if the 
three groups had been matched in 
respect to initial ability. The cor- 
relations associated with these re- 
gression coefficients are mostly in the 
neighborhood of .55. On the average, 
the unadjusted means of Group D 
are .40 sec. higher than the adjusted
values; the unadjusted means for
Group M2, .39 sec. lower.
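
The adjustment described above corresponds to the usual covariance adjustment of group means. A minimal Python sketch follows, assuming the standard formula: the adjusted mean equals the group's mean on the trial minus the pooled within-groups slope times the deviation of the group's matching-variable mean from the grand mean. The function name and the small data set are hypothetical and are not the scores of this experiment.

```python
def adjusted_means(groups):
    """Covariance-adjusted group means.

    groups maps a label to a list of (x, y) pairs, where x is the matching
    variable (e.g., sum of Trials 1-4) and y is the score on a later trial.
    """
    def mean(values):
        return sum(values) / len(values)

    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = mean(all_x)

    # Pooled within-groups slope: summed cross-products over summed squares,
    # each taken about the group's own means.
    sxy = sxx = 0.0
    stats = {}
    for label, pairs in groups.items():
        mx = mean([x for x, _ in pairs])
        my = mean([y for _, y in pairs])
        stats[label] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in pairs)
        sxx += sum((x - mx) ** 2 for x, _ in pairs)
    slope = sxy / sxx

    return {label: my - slope * (mx - grand_x) for label, (mx, my) in stats.items()}

# Hypothetical scores: Group D starts with higher initial ability than M2.
data = {
    "D":  [(10, 12), (12, 13), (14, 14)],
    "M2": [(6, 9), (8, 10), (10, 11)],
}
print(adjusted_means(data))
```

With these made-up scores the adjusted mean for D falls below its raw mean and that for M2 rises above it, which is the same direction of adjustment as reported for the actual groups.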

Following the 5-min. rest after 
Trial 20, Group M2 displays considerable
recovery, almost doubling
its score on Trial 21 over its score on
Trial 20. Characteristically, following
such massing of practice, there is
a display of warm-up in Groups M1
and M2. Performance for these
groups, as indicated by the adjusted 
values for Trials 28-30, levels off only 
slightly below the level of Group D. 
For the distributed practice group, 
performance on the first four postrest 
trials is about the same as immediately 
before the rest. 

An analysis of covariance, based 
upon the matching variable of Trials 
1-4, demonstrates that the groups 
differ significantly on Trial 21
(F = 22.26, 2 and 38 df).

Performance after one week.—Fol- 
lowing Trial 30, there was an interval 
of one week before practice was again 
resumed. All groups were treated as 
they had been on Trials 21-30. As 
was the case with Trials 21-30, the 
means of Trials 31-50 were adjusted 
by use of the within-groups regression 


> 





TIME_ON TARGET (SEC) 
°o 





i 56 40 

20 S€C TRIALS 
Fic. 2. 

function of 

trials. 


Performance after one week as a 
degree of distribution of early 












=e4 tis 
8 4 
w 
wi 
= 
°o 
uv 
¥ 24 WARM-UP 
= MAge---@ 
ea) REL EARNING 
a 
S*s—e 
* 7 7 _s as As J 7 
re 2 Oe 3 Bast .:2 
POSTREST TRIALS 
Fic. 3. Postrest warm-up curves (massed 
practice groups) and relearning curves (all 
groups). 
coefficients between the matching 


variable of Trials 1-4 and the indi- 
vidual trials after 30 (see Fig. 2). 

A proof of the null hypothesis is, 
of course, out of the question. Never- 
theless, the fact that these means 
differ as little as they do is of considerable
interest. Particularly noteworthy
is the performance of all three
groups on the first recall trial. While 
the difference is microscopic, what 
difference does exist favors Group 
M;. For the last five trials, 46-50, 
the mean performances of the three 
groups are: Group D, 15.94; Group
M1, 16.08; Group M2, 15.97. For
Trials 31-50, the mean performances
of the groups are: Group D, 15.04;
Group M1, 14.90; Group M2, 15.03.

Another aspect of the data, of some 
interest, is a comparison of the warm-up
curves of Groups M1 and M2 after
rest (Trials 21-27) with the warm-up 
trials of all three groups after the 
one-week interval (Trials 31-37). In 
Fig. 3, this comparison is made by 
plotting trials after rest along the 
abscissa and improvement over the 
first postrest trial along the ordinate. 


DISCUSSION


Following the interval of one week 
after Trial 30, the relearning curves of 
all three groups bear such a striking 
similarity to one another that one is 
easily led to believe that Ss in these three 
















groups are operating at substantially the 
same level of performance, despite their 
having had quite different degrees of 
massing of practice during the first 40% 
of practice. The data suggest, quite 
strongly, that, at least so far as the 
conditions of the experiment permit 
generalization, the rather heavy amount 
of massing for Group M2 had no decremental
effect beyond the first few warm-up
trials following the first rest. Since
all three groups have practically the 
same means on the first recall trial and 
are remarkably similar beyond that 
point, it may be assumed that the 
temporary differential in favor of the 
initially well-distributed practice group 
was dissipated rather soon after the 
initially massed groups shifted to dis- 
tributed practice. The small difference 
on Trial 30 in favor of the distributed 
practice group (Group D) may be a 
chance affair (it did not approach sig- 
nificance), or it may be due to incomplete 
dissipation of I_R.

The data of the experiment, then, 
might be explained by recourse to three 
concepts: learning, temporary work 
decrement, and warm-up. However, if 
one assumes that the rate of habit growth 
is the same under conditions of massed 
and distributed practice, and that per- 
formance is only temporarily depressed 
by I_R, one ought, logically, not to expect
to find warm-up. As Ellis (5) points 
out, one would predict that both practice 
conditions would result in equal per- 
formance following a rest, if, as seems 
the case, the construct of permanent 
work decrement is unnecessary. How- 
ever, the two massed practice conditions, 
while giving no evidence of permanent 
decrement, do suggest that, with I_R well
dissipated, a previously massed practice 
group will not operate at the same level 
of performance as a formerly distributed 
practice group, at least for the first few 
trials after rest. This postrest decre- 
ment is sometimes referred to as a “need 
to warm up,” though the connotation of 
the term suggests something more dy- 
namic than the mere convergence of two 
performance curves. One explanation of 
the warm-up effect has been advanced by 




Ammons (3): under massed practice 
there is a gradual loss of set, which is 
re-established during the first few post- 
rest trials. 

For another possible explanation, one 
might cast the entire argument in terms 
of transfer theory. According to this 
view, one would conceive of massed 
practice as a situation in which a typical 
S learns an adjustive act, adjustive to
his own particular needs, the demands of
the experiment, and the need generated
by the gradual accumulation of I_R. This
behavior will transfer positively to a
subsequent situation where S is practicing
under distributed practice conditions.
The net effect of the transfer
is positive, but certain elements of the
task may transfer negatively; e.g., a
particular stance, which serves to mitigate
I_R under massed practice, may well
be a hindrance under distributed practice.
These negative elements are
present on the first postrest trial, but
are gradually dropped out as practice
proceeds, a phenomenon usually associated
with negative transfer and
retroactive inhibition (7). With continuing
practice under the postrest
conditions, the two performance curves 
will gradually converge, depending on 
the traditional variables affecting trans- 
fer (set, similarity of tasks, degree of 
training, etc.). 

The same argument could be used to 
interpret the warm-up displayed after a 
long interval of no practice. Such 
decreases in retention have generally 
been explained in terms of retroactive 
and proactive inhibition, which, on 
analysis, are fundamentally transfer 
effects. According to the above view, 
there are acts learned in other situations, 
elements of which transfer negatively to 
the experimental task. These retro- 
active and proactive effects are transi- 
tory, evident on the first recall trial, less 
evident as relearning progresses. 

By extension, one might also view the 
rapid improvement in performance in 
the earliest stages of practice as a similar 
kind of warm-up, wherein elements from 
nonexperimental activities temporarily 
intrude. Once these aspects are diminished
in effect, performance, if under
ideal practice conditions, proceeds more 
slowly, and is a closer approximation to 
the growth of habit strength of the 
elements of the task associated with 
optimum performance of the skill. 


SUMMARY 


The problem concerned the relatively long- 
run effects of massing of practice during the 
early stages of learning on subsequent per- 
formance, when that performance is under 
relatively ideal conditions of practice and 
motivation. 

Female Ss were divided into three groups, 
representing three degrees of massing of practice. 
One group (N = 14) received four 20-sec. trials 
separated by 10-sec. rests, followed by 5 min. 
and 20 sec. of continuous practice. A second 
group (N = 13) received twenty 20-sec. trials, 
separated by 10-sec. rests. A third group 
(N = 14) had, in addition to the 10-sec. rests 
between trials, 5-min. rests after every fourth 
trial. Following Trial 20, all groups were given 
30 well-distributed trials, with a 10-sec. rest 
after every odd-numbered trial, a 2-min. rest 
after every even-numbered trial. One week 
intervened between Trials 30 and 31. Knowl- 
edge of results was continually furnished to all 
Ss, by informing them at the conclusion of each 
trial of the number of seconds spent on the 
target. 

When the postrest performance curves of the 
three groups were adjusted for differences in 
initial ability, there was no evidence of perma- 
nent decrement. Temporary decrement was 
present on the first postrest trial, but was apparently
quickly dissipated. The performance
curves of the three groups, following the one- 
week interval between trials, were remarkably 
similar. 







Warm-up after massed practice and during 
relearning, and the failure to find evidence of 
permanent work decrement, were explained in 
terms of transfer theory. 


REFERENCES 


1. Adams, J. A. Warm-up decrement in performance
on the pursuit rotor. Amer. J.
Psychol., 1952, 65, 404-414.

2. Ammons, R. B. Acquisition of motor skill:
I. Quantitative analysis and theoretical
formulation. Psychol. Rev., 1947, 54,
263-281.

3. Ammons, R. B. Acquisition of motor skill:
II. Effects of initially distributed practice
on rotary pursuit performance. J.
exp. Psychol., 1950, 40, 777-787.

4. Archer, E. J. Postrest performance in
motor learning as a function of prerest
distribution of practice. J. exp. Psychol.,
1954, 47, 47-51.

5. Ellis, D. S. Inhibition theory and the
effort variable. Psychol. Rev., 1953, 60,
383-392.

6. Kimble, G. A. An experimental test of a
two-factor theory of inhibition. J. exp.
Psychol., 1949, 39, 15-23.

7. Melton, A. W., & Irwin, J. McQ. The
influence of degree of interpolated learning
on retroactive inhibition and the
overt transfer of specific responses.
Amer. J. Psychol., 1940, 53, 173-203.

8. Reynolds, B., & Bilodeau, I. McD. Acquisition
and retention of three psychomotor
tasks as a function of distribution
of practice during acquisition.
J. exp. Psychol., 1952, 44, 19-26.

(Received August 29, 1955)





Journal of Experimental Psychology 
Vol. 52, No. 3, 1956 


FAMILIAR SIZE AS A CUE TO SIZE IN THE
PRESENCE OF CONFLICTING CUES¹


CHARLES W. SLACK²


Princeton University 


One special aspect of the nativism- 
empiricism problem, the role of fa- 
miliar size as a cue to size and dis- 
tance, was suggested by Helmholtz 
(2, p. 282) and has interested such 
modern investigators as Ittelson (4). 
In Ittelson’s experiment, familiar size 
was shown to operate as a cue to size 
and distance, but no attempt was 
made to provide S with cues other 
than familiar size. It has been shown 
that S will rely upon his past experi- 
ence with an object in determining its 
size and distance under conditions of 
one-eyed viewing in a completely dark 
surround. Under these conditions, S
could not respond to an absolute
distance or size question on the basis
of information gained from stimulation
alone. When the object is a
familiar one with a familiar size, S 
seems to provide the needed informa- 
tion from his experience with similar 
objects in the past. How much, if 
any, his past experience will influence 
his judgments of size and distance 
under conditions like those of Gibson 
(1), where other cues are in abun- 
dance, is a problem open for investi- 
gation. 

A closely related topic is that of 
size constancy. In contrast to theo- 
rists like Pratt (5) who would define 
constancy exclusive of S’s past experience,
Ittelson defines constancy
as: “. . . the attempt of the individual
to create and maintain a world
which deviates as little as possible
from the world which he has experienced
in the past . . .” (4, p. 292).

¹ The author is very greatly indebted to Mr.
Peter D. Horne for his assistance in gathering
the data and preparing the report. Some of the
paper is based upon Mr. Horne’s thesis for the
degree of Bachelor of Arts in Science at Princeton
University.

² Now at Harvard University.

Any experiment which utilizes a 
stimulus, the physical size of which is 
equal to the familiar size, cannot 
answer the question as to whether 
size constancy is a regression toward 
reality or toward a notion derived 
from S’s particular past. To answer 
these questions, experiments should 
be done in which objects which have 
a familiar size are presented to S in 
other than that size and under con- 
ditions where other cues are present 
in quantity sufficient to determine the 
apparent size and distance of the 
object. 

The present study attempts to 
investigate the effect of familiar size 
(sometimes called known size or 
assumed size), a subjective variable, 
upon apparent size. It investigates 
this effect under conditions like those 
used by Gibson where there are many 
other cues present, the experiment 
being done outdoors in a situation 
complete with gradients of texture and 
other indications to depth. The S is 
allowed the use of two eyes and some 
head movements as well. 

In this experiment Ss were pre- 
sented with three chairs of varying 
sizes but proportional dimensions:
smaller than usual, normal, and
larger than usual, and three sticks as
controls. The experiment was designed
to determine whether there was
a discrepancy between the apparent







sizes of the chairs (chairs assumed to 
be familiar objects with a narrow 
experience-determined range of char- 
acteristic heights) and the apparent 
size of sticks (sticks assumed to be 
neutral objects with a wide range of 
characteristic heights). A theory of 
constancy like that of Ittelson would 
predict a systematic difference in the 
apparent sizes of the chairs and sticks, 
and a particular direction to the 
difference. The large chair would be 
judged smaller than a stick of its size, 
the small chair would be judged larger 
than a stick of its size, and the normal- 
sized chair would be judged equal to 
a stick of its size. The Ss would tend 
to regress their apparent size judg- 
ments toward the assumed or familiar 
size; familiar size would operate to 
make objects smaller than expected 
seem larger and objects larger than 
expected seem smaller even in the 
presence of many other non-object- 
determined cues. 


METHOD


Subjects and apparatus.—Twelve male college 
students who reported 20/20 vision (four wore 
glasses) were used as Ss. All were naive as to 
the purpose and content of the experiment. 

Three wooden chairs and three wooden sticks 
were used as viewing objects. The chairs had 
heights of 58, 344, and 254 in. The seat height 
of each chair was a little more than § the total 
height. The chairs looked like typical prison- 
made schoolroom chairs. They were armless 
and had two rather wide horizontal slats in the 
back. The sticks were pieces of 2 × 4 in.
lumber equal in height to the chairs. All
objects were painted a dark walnut color. An
adjustable steel tape attached to the ground at
one end and held by E at the other was used as
the comparison object for measuring S's
responses.


The experiment was performed outdoors on 
a large flat field. The background contained
trees and shrubs at a distance of about 300 yd. 
from the observation point. The S always 
remained standing in the same place, facing in 
the same direction, and the comparison object 
was located 3 ft. in front and 5 ft. to his right. 
The comparison tape was held and manipulated 




by E as S told him to raise or lower it until the 
desired match was obtained. 

Procedure.—Height (constancy) judgments 
by the method of reproduction were made of the 
objects from distances of 20, 30, and 40 yd.

The chairs were presented first to all Ss.
Each chair was observed at all three distances
before it was replaced by the next object. The
sticks were presented after the chairs, each stick
being observed at all three distances before the
next stick was presented. One trial, then, consisted
of three judgments on the same object.
Each S had six trials. The three viewing distances
and the starting points of the comparison
object were arranged in random order of presentation.
The order of presentation of the
sizes of chairs and sticks was arranged so that
each combination of order of presentation of
sizes was given to two Ss. Therefore, the order
of presentation of sizes was counterbalanced
over Ss; the order of presentation of object type
was constant for all Ss; the order of distances
was random from object to object; and the order
of starting height of the comparison tape was
random from judgment to judgment. This
design has obvious flaws, but the bidirectional
nature of the expected results makes alternative
explanations by design factors very difficult.
There is a possibility that results might be
different if sticks were presented before chairs,
but it would be hard to account for the expected
results in terms of order of presentation.

The Ss were given the following instructions:
“I want you to judge the height of the objects
which you will see in front of you. Instead of
making your judgments in feet or inches, I will
be standing a little in front and to the right of
you with a tape which I will move up or down
as you direct me, and I want you to tell me when
the height of the tape is the same height as the
object you see. By ‘same height’ I mean that
if both the height of the object itself and the
height of the tape were measured in inches, they
would be equal. Make your judgments in the
same manner as you would judge the height of
things every day, and try to make them within
ten or fifteen seconds if possible. However, I
don’t want to sacrifice accuracy for time, so if
you need more time to make your judgment,
take it. After each judgment please put on the
blindfold until I tell you to remove it again.
Please do not look behind you as that is where
other apparatus is kept. Do you have any
questions?”

The Ss wore blackout goggles in between
judgments while E was moving or replacing the
objects.

Each of 12 Ss observed the six objects once
at each distance, totaling 18 judgments per S.


The chairs were set facing at a 45° angle to the
left of S to enable him to see all the legs. The
sticks were set vertically in holes in the ground.


TABLE 1

Apparent Size of Chairs and Sticks at Three Distances Averaged over Ss
(Measurements are of apparent height in inches.)

                              30 yd.
Object              SD      Mean     SD      Mean

58-in. chair       5.25    59.30    3.62    58.07
58-in. stick       3.72    62.41    4.23    63.91

344-in. chair      3.83    41.76    3.17    42.18
344-in. stick      2.93    39.87    2.90    41.59

254-in. chair      3.36    33.54    4.16    34.54
254-in. stick      3.94    30.54    3.98    31.50
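
The counterbalancing described in the Procedure can be checked with a short sketch. It assumes that "each combination of order of presentation of sizes" means the six possible orders of the three sizes, with the same order used for chairs and for sticks; the size labels, the function names, and the rule assigning orders to Ss are hypothetical and are not taken from the report.

```python
from itertools import permutations

# Hypothetical labels for the three physical sizes of chairs and sticks.
SIZES = ("large", "normal", "small")
DISTANCES_YD = (20, 30, 40)
N_SUBJECTS = 12


def size_orders():
    """All possible orders in which the three sizes can be presented."""
    return list(permutations(SIZES))


def assign_orders(n_subjects=N_SUBJECTS):
    """Assign size orders so that every order is used equally often.

    The cycling rule below is an assumption made for illustration only.
    """
    orders = size_orders()
    if n_subjects % len(orders) != 0:
        raise ValueError("number of Ss must be a multiple of the number of orders")
    return {s: orders[s % len(orders)] for s in range(n_subjects)}


def judgments_per_subject():
    """Two object types (chair, stick) x three sizes x three distances."""
    return 2 * len(SIZES) * len(DISTANCES_YD)
```

Under these assumptions there are six orders, each given to two of the 12 Ss, and each S makes 18 judgments, which agrees with the counts stated in the Procedure.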


RESULTS 


Table 1 gives the apparent height 
of all objects at three distances 
averaged over Ss and the SDs between 
Ss. It can be seen from this table 
that the results are all in the predicted 
direction. The apparent size of the 
chairs is regressed toward the familiar 
size of a chair; the large chair is seen 
smaller than its control stick, the 
small chair is seen larger than its 
control stick. By one-tail t test, all
differences between chairs and sticks 
at the three distances and two sizes 
(the six extreme differences) were 
significant beyond the 5% level. 

However, the intersection of the 
curves is not at the predicted point. 
It was predicted that they would 
cross at the normal-sized chair. Only 
at the nearest distance, however, can 
the null hypothesis be rejected by a 
two-tail t test.

Overconstancy, a tendency to see 
far objects larger than near objects of 
the same physical size, was found in 
all cases. 


Discussion 


The results are taken as confirmation
of the hypothesis that familiar size plays
a significant role in the determination of
apparent size even in the presence of
many other conflicting cues such as
gradients, binocular cues, and others. In
Table 2 we may interpret the variance
between sizes as that due to the operation
of all other cues in the situation. The
interaction O × S variance may be as-
sumed to be the general effect of familiar
size averaged over subjects and condi-
tions. The low variance between objects
shows that we happened to pick objects
which on the average over size did not
differ from one another except in the
property of familiar size. Actually, we
expected the curves to cross not in the
middle (average) but at the point of the
“normal”-sized chair, or 344 in. At one
distance these points were significantly
different. If we wish to explain this
fact post hoc, we might say either or both
of two things: either we did not pick
just the right normal size for a chair (we
should have picked a larger normal chair),
or the massiveness of the chairs as
compared with the sticks may have made
the chairs seem taller, especially at the
shortest distance. The crude prediction
for the theory about constancy and
familiar size, however, concerns the
relative slopes of the two curves, not their
intercepts.


TABLE 2

Analysis of the Variance of Apparent
Size Judgments

Source                   df

Subjects and orders
Objects
Sizes
Distances
Residual
Total

*P < .01.

The significant variance attributable 
between distances shows that over- 
constancy tends to increase with dis- 
tance. The general overconstancy is a 
result for which we will not attempt to 
account. By now it is the typical 
result in multi-cued constancy experi- 
ments such as Smith’s and others men- 
tioned by him (6). We feel fairly 
certain, however, that our results are 
not peculiar to those circumstances 
where overconstancy is typical since in 
a situation essentially the same as one 
where Holway and Boring (3) got great 
underconstancy, Ittelson (4) got almost 
complete dominance of familiar size. 
Underconstancy seems to obtain with 
reduced cues (3) and the importance of 
familiar size as a determinant of the 
response increases as the other cues are 
reduced. 

When the difference in apparent height
between chairs and sticks is taken as a
percentage of the total apparent height
of the chairs, it varies from 4% to 10%
depending upon other conditions. Fa-
miliar size, as a cue to size, seems to
make a contribution of the same order
as binocular vision if we are to judge by
the results of the Holway-Boring study
(3). In their experiment, when indoor
conditions designed to produce constancy
were used, the effect of adding binocular
vision was to increase the apparent size
from constancy to about 10% over-
constancy.
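
The percentage measure described above can be illustrated with a small sketch. It assumes that the paired Mean and SD columns of Table 1 belong to the 30-yd. heading; the dictionary keys and the function name are hypothetical, and only the mean values come from the table.

```python
# Mean apparent heights in inches, transcribed from Table 1.
# Assumption: these are the 30-yd. means; the keys are illustrative labels.
MEANS_30YD = {
    "large": {"chair": 59.30, "stick": 62.41},
    "normal": {"chair": 41.76, "stick": 39.87},
    "small": {"chair": 33.54, "stick": 30.54},
}


def percent_difference(chair, stick):
    """Chair-stick difference as a percentage of the chair's apparent height."""
    return 100.0 * abs(chair - stick) / chair


if __name__ == "__main__":
    for size, pair in MEANS_30YD.items():
        print(size, round(percent_difference(**pair), 1))
```

With these values the three differences come to roughly 5.2%, 4.5%, and 8.9% of the chairs' apparent height, which lies inside the 4% to 10% range reported in the text.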


It is probably a mistake, however, to
talk about the relative importance of
several cues in the abstract. The com-
plex interrelationships which exist be-
tween cues make it almost impossible to
determine the strength of a cue without
specifying the situation in a rather com-
plete manner.

In Ittelson’s study, familiar size was 
shown to be a cue both to size and to 
distance (4). No attempt was made, in 
the present experiment, to see whether 
the objects would be appropriately 
mis-localized in distance according to 
their familiar size. We might expect 
that the large chair would be seen as 
nearer and the small chair as farther than 
the sticks at the same distance. This
kind of study remains to be done.

We feel that the results of this study, 
taken with the results of the former 
study by Ittelson, show that familiar 
size is a cue to apparent size, and an 
important one. It has the character- 
istic of being obviously related only to 
the history of S and not to any physical
properties of the source of stimulation 
except as those properties or their 
psychological derivatives are related to 
his history. Other cues—indeed, all 
other cues—may turn out to have this 
characteristic but, at present, this is 
not obvious to everyone. Any cue may 
be defined in terms of the experiencing 
individual; familiar size must be. 


SUMMARY 


An experiment was conducted in which chairs 
of varying sizes—small, normal, and large—were 
presented to Ss at distances of 20, 30, and 40 
yards. Size-constancy (attempted “real” size) 
judgments were obtained by the method of 
reproduction with a variable manipulated by E
which was three ft. in front and five ft. to the 
right of S. The experiment was conducted 
outdoors on a large flat field. It was predicted 
that, even in the presence of other cues such as 
binocular and gradient cues, familiar size would 
operate to make S judge the small chair to be 
larger and the large chair to be smaller than 
sticks of the same height as the chairs. 

The results agreed with the predictions in a 
highly significant fashion, indicating that 
familiar size should properly be considered as 
one of the cues to apparent size. 

The results have bearing on theories of size
constancy, since they show that constancy is, at
least in part, a regression toward past experience.





REFERENCES

1. Gibson, J. J. The perception of the visual
world. Cambridge: Houghton Mifflin, 1950.

2. Helmholtz, H. von. Physiological optics,
Vol. 3 (Trans. by J. P. C. Southall).
Ithaca, N. Y.: Optical Society of America,
1925.

3. Holway, A. H., & Boring, E. G. Determi-
nants of apparent visual size with distance
variant. Amer. J. Psychol., 1941, 54,
21-37.

4. Ittelson, W. H. The constancies in per-
ceptual theory. Psychol. Rev., 1951, 58,
285-294.

5. [entry illegible] 85-107.

6. Smith, W. M. A methodological study of
size-distance perception. J. Psychol.,
1953, 35, 143-153.

(Received August 25, 1955)





Journal of Experimental Psychology 
Vol. 52, No. 3, 1956 


STEREOPSIS PRODUCED WITHOUT HORIZONTALLY 
DISPARATE STIMULUS LOCI 


PAUL C. SQUIRES 


Medical Research Laboratory, Human Engineering Branch, U.S. Naval Submarine Base, 
New London, Conn. 


Ogle states: “The concept that 
stereoscopic perception of space de- 
pends on the combination of the 
perceived forms of figures seen by the 
two eyes has been expressed before, 
for example, by Lau. He claims to 
have experienced stereoscopic depth 
in the haploscope from figures whose 
stimuli for the stereoscopic depth 
were pure illusions and which did not 
actually introduce disparities be- 
tween the retinal images. Most ex- 
perimenters have failed to verify 
these experiments, however” (3, pp. 
198-199). 

Lau sums up his studies using the
Zöllner, Höfler, and Poggendorf il-
lusions, thus: “These results possess
great theoretical significance. One
perceives that here the cross-disparate
stimulation of a system of identical
points in the retinas does not call
forth an impression of depth, but that
each eye elaborates its stimulus com-
plex into a Gestalt and that diverg-
encies between these very Gestalten
yield the perception of depth” (1, p.
3). Furthermore: “By means of our
experiments it becomes virtually evi-
dent that stereopsis is not produced
by point for point stimulation, but
that the Gestalten as wholes ‘arrive
at a comparison,’ so to speak. . . .
Therefore, we must have in the case of
stereoscopic seeing a process which
functionally resembles a comparison
of configurational images” (2, p. 125).

Wilde concludes: “Gestalt dis-
paration alone thus furnishes depth”
(6, p. 262). “Binocular depth has
nothing whatever to do with cross-
disparation in the old sense of that
term—depth is constructed in the
first instance upon the ground of
completed Gestalt units and is not
possible without these” (6, p. 259).
Tausch, following up Wilde’s work,
says: “the stimuli received by
each eye are transformed into
Gestalten and . . . only the disparity
and correspondence of these Gestalten
results in the activation of space
perception” (5, p. 419).

Ogle, in a recent study, concludes:
“The results clearly suggest that
there may be two separate aspects of
binocular depth perception. . . . It
would be preferable to call the first
aspect of stereopsis an ‘obligatory’
type and the second a ‘facultative’
type. . . . A facultative depth per-
ception from disparate images may
be said to be in the category of the
Gestalt perceptions. Its occurrence
is not limited by specifically hori-
zontally associated disparate retinal
elements. Certainly on the basis
of the results of the present experi-
ment the inference that all of stereo-
scopic depth perception belongs in the
same category must be rejected” (4,
pp. 232-233). But Ogle still insists
that the “true,” the “obligatory”
sense of depth requires the simul-
taneous stimulation of horizontally
disparate retinal elements (4, p. 233).
Whether or not Ogle is correct is a
matter of inquiry in this paper.

As far as phenomenology is con-
cerned, Ogle has well said: “Until
more is known about the neuro-
anatomy of the visual areas of the
cortex and the association fibres
between the two halves of the brain,
we must be content to discuss fusion
and the fusional processes in phe-
nomenologic terms, leaving in abey-
ance for the time being the exact
meaning of these words” (3, p. 60).

No study of Lau’s approach to the 
problem of stereoscopic depth per- 
ception exists in the English language, 
as far as we are aware. The present 
experiment takes its departure from 
Lau’s work, but makes a radical 
change in the mode of attack. 


METHOD


Apparatus.—A Wottring Troposcope (Ameri-
can Optical Co.) was used. This is one of the
major amblyoscopes; built on the Wheatstone
mirror principle, it serves as a reflecting stereo-
scope. A space Eikonometer (American Optical
Co.) test was given each O; Eikonometer
readings were made with extreme care for the
detection of any possible aniseikonic deviations
that might be regarded as clinical, but no such
deviations were found. All Os perceived depth
strongly in the Eikonometer.

Observers.—There were five Os; three of these
were not psychologically trained. The writer
acted both as E and O. The E took the greatest
care to avoid making suggestion of a possible
depth effect in the Troposcope; only E had prior
acquaintance with the apparatus.

Visual patterns.—Four pairs of targets were
used: the Höfler, the Zöllner, the Poggendorf,
and the Müller-Lyer illusions (Fig. 1); each pair
can be unified or “fused” in the Troposcope.
The noteworthy feature of these targets is that,
whereas the illusion pattern is presented to one
eye, the lines-in-chief of this pattern are matched
by corresponding lines shown to the other eye,
no other parts of the illusory figure being present.
By corresponding lines are meant horizontally
nondisparate lines in the geometrical-physical
sense. In presenting the pair of corresponding
lines-in-chief we have taken the ultimate step
which was not taken by Lau. It should be
remarked, in strict fairness to Lau, that he did
try one such pair (Zöllner), but failed to see
significance in the pair (1, Fig. 1): as he says,
“. . . this result did not yet satisfy me . . .”
(1, p. 2). From here on, including (2), Lau
utilized geometrical horizontal disparity for all
his figures.

Our patterns were fine-line ink drawings on
translucent white cards 3} in. X 4 in.; these
were placed between glass slides gy in. thick and
inserted into the slots of the Troposcope where
they were transilluminated. The Höfler pair
was also reproduced on glass slides by photo-
graphic process, using Kodalith film.

Fig. 1. The Höfler, Zöllner, Poggendorf, and
Müller-Lyer figures employed in this experiment
(in order from top to bottom). The figure on
the left was presented to one eye and the figure
on the right was presented simultaneously to the
other eye.

The basic principle underlying our pattern 
pairs is, then, that they involve a purely phe- 
nomenal horizontal disparation. 


RESULTS


The Höfler illusion.—When the two
targets are fused, unified, in the
Troposcope, one pair of intersecting
lines is seen lying in a vertical plane
and standing well out in front of the
radial grid; one O saw the intersecting
lines lying way back of the grid. The
stereoscopic effect is not “facultative”
(4, p. 233), but precise, strong, com-
pulsive. The illusion of curvature,
which exists for the monocular view,
is either destroyed or practically
destroyed when the pair of targets
are unified in the Troposcope. The
targets, as presented in photographed
form on glass slides, gave a magnifi-
cent stereo-depth appearance, the
illusion of bending being virtually
inappreciable.¹

To test configurational interaction,
the Höfler pattern was modified so
that the crossed lines-in-chief were
prolonged to touch the circle, the pair
of lines being geometrically congruent
with those in the target for the other
eye. Under binocular unification the
resulting figure fails to evidence much
of a stereoscopic effect; the two lines-
in-chief, although their curvature is
much reduced, never attain the mini-
mum of curvature manifest in the
fusion of the original pattern. Some
bowing out toward O of the principal
lines is perceived to take place in the
central zone of the fused targets in
the modified pattern: the crossed
lines seem to be firmly attached to the
circle; stereo-depth is almost “killed.”


The Zöllner illusion.—The same
general effect is found here as has
been described for the Höfler illusion
pair. The four principal lines fuse
and appear to be nearer to O than
does the field of crosshatches; this
was generally observed to be the case,
although for some Os the principal
lines sometimes appeared to be back
of the crosshatches.


¹ Modifications of the Höfler pattern were
made wherein 8 and 32 radial lines were used
for the grid, giving a weaker and stronger Höfler
effect, respectively, than the grid having 16
radial lines. For the writer, the stronger the
illusion, the greater the degree of stereo-depth;
these observations were made by him after the
completion of the body of the experiment.


Stereo-depth never entered into the
binocular picture until the two sets
of four principal lines became fused.
When fusion took place, the fused set
of lines were phenomenally parallel to
one another; the Zöllner effect either
vanished completely or became
small. When the fusion emerged
excellently and with comparative ease,
the stereoscopic effect was good. But
the fusion in this pattern was not
always easy to get; when a binocular
struggle was in progress, depth was
inchoate. Once the struggle was
over, with the lines phenomenally
parallel, the stereo-depth was good.
The phenomenal parallelism was
typically difficult to keep hold of,
however. In no instance was there
such a strong depth effect as in the
Höfler pattern.

The Poggendorf illusion.—It will at 
once be noticed that the slanting line 
is broken so as not to touch the 
vertical lines. Through this device 
quite a good stereo-depth was ob- 
tained; the broken-line segments stood 
out well from the plane of the vertical 
lines. Benefiting by our experience 
with the Höfler figures, we discovered
that attachment of the slanting seg- 
ments to the vertical lines impov- 
erished the stereo-depth phenomenon. 
A slight detachment of the segments 
from the vertical lines sufficed to 
condition good, compulsive stereo- 
scopic depth. The Poggendorf il- 
lusion ceased to exist when fusion of 
the sets of slanting segments gave the 
stereo-effect. 

The Müller-Lyer illusion.—Lau dealt
with this very briefly (1, p. 4,
footnote 1). No design of this figu-
ral pair is shown by him. He claims
to have obtained a depth effect. He
states that Köhler could not get the
effect. Using a conventional set of
Müller-Lyer figures, the most stereo-
depth we could ever get was only
slight; there is a real question as to
whether this “stereo-depth” could
properly be so characterized. The
Müller-Lyer pattern was modified by
adding thereto the Zöllner cross-
hatches, thus forcing upon the pair
of figures a phenomenal horizontal dis-
parity; the result was good stereo-
scopic depth. The binocular fusion
yielded a pair of seemingly parallel
lines which stood out in a vertical
plane toward O, and apparently
equal in length: the Müller-Lyer il-
lusion was destroyed.


Discussion 


This experiment has produced evi-
dence showing that stereoscopic depth
does not depend exclusively upon hor-
izontally disparate stimulus loci. An
obligatory, compulsory depth experience
can be evoked by stimulus configur-
ations that are only phenomenally dis-
parate, no geometric-physical horizontal
disparation being present. Striking evi-
dence of this fact is afforded by the
Höfler patterns; we are entirely unable
to characterize the depth experience
accompanying fusion as merely “fac-
ultative” (4, p. 233). True, obliga-
tory, stereoscopic depth can be obtained
where phenomenal horizontal disparity
alone is present.

In each of the four patterns in this 
experiment an illusory figure is presented 
to one eye, a nonillusory figure to the 
other eye. Thus, there is generated a 
phenomenal disparation, conflict, be-
tween Gestalten; as Lau puts it,
“differences between these Gestalten yield
perceived depth” (1, p. 3).

When the members of the Höfler pat-
tern are unified in the Troposcope, the
unification is immediate, compulsive; the
depth experience shows no tendency to
disintegrate even under prolonged fixa-
tion. Our findings are quite opposed to
what may be called Ogle’s atomistic point
of view, which holds that an obligatory
sense of depth necessitates the simul-
taneous stimulation of horizontally as-
sociated disparate retinal elements (4,
p. 233).

Using the modification of the Höfler
pattern wherein the principal lines touch 
the circle, an interesting thing is seen: 
although the illusion of curvature is 
much diminished in the Troposcope, the 
fused crossed lines do not detach from 
the circle and very slight depth difference 
is perceived to exist between the central 
region of the grid and the crossed lines in 
that region. Another configurational 
factor has entered to inhibit the spatial 
separation in depth of the crossed lines 
and the radial grid. The crossed lines 
seem to engage in a struggle to escape 
from the plane of the circle, protruding 
very slightly toward O in the central 
area of the grid; but the crossed pair 
of lines are firmly attached to the circle 
and no material amount of stereoscopic 
depth is permitted to develop by reason 
of the configurational attachment. Nor 
does the illusion ever become as small 
in degree as it does in the Höfler pattern.

Unifications conditioned by physical- 
geometric disparity and by purely phe- 
nomenal disparity are equivalent. It is 
clear that the former type of horizontal 
disparation is not necessary for the con- 
ditioning of obligatory stereo-depth. 

The depth phenomena conditioned by
the Zöllner and Poggendorf illusions are
of the same order as exemplified by the
Höfler pair, the configurational dynamics
being of the same nature. In the
Zöllner figures, however, the stereo-
unification typically consumes an ap-
preciable time for emergence and per-
fection; the two four-line sets encounter
as a rule marked resistance to mergence
by the crosshatched grid. Neverthe-
less, the general configurational situation
is fundamentally identical for the Zöllner
and Höfler pairs; in both there is active
as configurational agent a purely phe-
nomenal horizontal disparity, but no
geometrical-physical disparity whatso-
ever. When, in the Zöllner pattern, the
four lines-in-chief stand forth unified in
stable fashion, completing the conquest
over the perceptual claims of the cross-
hatches, they are seen in strong stereo-
depth and the illusion is destroyed.
Stated tersely, stereoscopic depth may be
regarded as the successful outcome of
configurational struggle.

The Poggendorf pair likewise illus- 
trates the fact that stereoscopic depth 
may be created out of a strictly phe- 
nomenal configurational disparity. But 
due to the fact that the Poggendorf 
illusion is a relatively weak one even 
under the most favorable conditions, the 
stereo-effect is not as strong as for the 
Höfler and Zöllner patterns. Never-
theless, the depth effect is genuinely
stereoscopic, without doubt. 

The proposition comes clearly down
to this: solely phenomenal configura-
tional disparity is sufficient to condition
stereoscopic depth experience, the stimu-
lation of disparate retinal elements not
being necessary for the production of
stereo-depth.


SUMMARY 


Configurational disparation (whether phe- 
nomenal only or both phenomenal and geo- 
metric-physical), can operate as the sufficient 
condition for stereoscopic depth perception, but 
is not the universally necessary condition for the 
emergence thereof. Neither Gestalt disparation 
nor point disparation (in the Helmholtzian 
sense) is both necessary and sufficient with 
respect to conditioning stereoscopic experience. 




To Lau belongs the credit for pioneer work on 
Gestalt disparation. He is essentially correct 
in his contention that some sort of configura-
tional theory is required for the adequate
understanding of stereoscopic depth. We are 
unable to agree with Ogle that true stereoscopic 
depth must depend upon horizontally associated 
disparate retinal elements. 

Genuine stereoscopic depth experience 
emerges out of the struggle to unify two phe-
nomenally disparate configurations, even though
the Gestalten cannot be referred back to any 
geometrical-physical disparation in the stimulus 
pattern: apparent horizontal disparation be- 
tween the configurations is sufficient. 


REFERENCES 


1. Lau, E. Ueber das stereoskopische Sehen.
Psychol. Forsch., 1922, 2, 1-5.

2. Lau, E. Ueber das stereoskopische Sehen.
Psychol. Forsch., 1925, 6, 122-126.

3. Ogle, K. N. Researches in binocular vision.
Philadelphia: W. B. Saunders, 1950.

4. Ogle, K. N. On stereoscopic depth per-
ception. J. exp. Psychol., 1954, 48,
225-233.

5. Tausch, R. Die beidäugige Raumwahr-
nehmung. Z. angewand. Psychol., 1953,
3, 394-421.

6. Wilde, K. Der Punktreiheneffekt und die
Rolle der binok. Querdisparation beim
Tiefensehen. Psychol. Forsch., 1950, 23,
223-262.

(Received August 10, 1955)











Journal of Experimental Psychology
Vol. 52, No. 3, 1956


VIGILANCE IN THE DETECTION OF LOW-INTENSITY
VISUAL STIMULI¹


JACK A. ADAMS²
Operator Laboratory, Air Force Personnel and Training Research Center


The topic of vigilant or attentive
behavior³ has been receiving increas-
ing research emphasis in recent years.
A prime motivating force for this
research undoubtedly has been the
importance of military jobs where
vigilant behavior is mandatory for
successful operator participation in
certain weapons systems. Elicitation
of vigilant behavior occurs when an
otherwise steady stimulus state is
aperiodically punctuated by stimulus
change, and the task of the operator
is to detect and report the occurrence
of this change. An example of such
a military task is the detection of a
return on an air defense surveillance
radar scope. Industrial jobs such as
scanning assembly-line products for
defects also require sustained visual
attention.

A number of investigators (2, 3, 5,
6) have studied vigilant behavior
with Bakan⁴ being the most active
and recent. In general, the findings
have shown that detection proficiency
is a decreasing function of observation
time and that this decrement can be
partially or completely eliminated by
rest, knowledge of results, motivating
instructions, and Benzedrine. A
recent theoretical account of attention
by Berlyne (1) employs the Hullian
framework and would ascribe the
decrement to the reactive inhibition
concept I_R (4). Bakan also has used
I_R as an explanatory device. Mack-
worth (6) accounts for his results
within a Pavlovian frame of reference.

A class of vigilance variables re-
ceiving relatively little research at-
tention is stimulus factors affecting
detection proficiency over a prolonged
observation period. Mackworth (6)
has investigated the role of stimulus
intensity in the detection of small
visual stimuli and found that over-all
detection efficiency is much greater
for high intensity stimuli than low
intensity stimuli. The experiment
to be presented here also evaluates
the effect of visual stimulus intensity
on detection proficiency and, in ad-
dition, includes the variables stimulus
duration and interpolated rest. Low
intensity visual stimuli have been
employed because Mackworth’s re-
search (6) as well as common experi-
ence suggests that visual stimuli
whose intensity is high relative to the
background have a uniformly high de-
tection level. The concern of this
study was to research the stimulus
brightness range where decremental
phenomena as a function of observa-
tion time could be expected to occur.


¹ The experimental work for this study was
performed as part of the United States Air Force
Human Resources Research and Development
Program. The opinions or conclusions con-
tained in this report are those of the author.
They are not to be construed as reflecting the
views or endorsement of the Department of the
Air Force.

² Now at Field Unit No. 2, Operator Labora-
tory, Tyndall Air Force Base, Florida.

³ The terms vigilance and attention will be
considered synonymous and will be regarded as
convenient labels for an observed class of be-
havior. They will not be accorded intervening
variable status.

⁴ Bakan, P. Preliminary tests of vigilance
for verbal materials, Unpublished Memorandum
Report B-2, Training Research Laboratory,
Univer. of Illinois, 31 March 1952 (Mimeo);
Vigilance decrement: A critical review of the
literature and experimental program, Unpub-
lished Memorandum Report B-1l, Training
Research Laboratory, Univer. of Illinois, 24 May
1952 (Mimeo); and The change of threshold as a
measure of decrement in a vigilance task, Un-
published Memorandum Report B-S, Training
Research Laboratory, Univer. of Illinois, 3
October 1952 (Mimeo).


METHOD


Apparatus.—The device used in the investi- 
gation was the Vigilance Test and had a number 
of features in common with Mackworth’s 
Synthetic Radar Test (6). From S’s view- 
point, the display and response features of the 
task were simple. The S sat in front of a 5-in. 
diameter white screen in an air-conditioned 
room whose only light source was a 25-w. ceiling 
bulb. Mounted on the table in front of him 
was a spring-return toggle switch. His task 
was to flick the switch whenever he detected a 
blip of light on the screen. The interstimulus 
interval was unsystematic so S constantly had 
to attend to the screen. Although the apparatus
permitted presentation of the stimulus light in 
any one of 50 areas on the screen, only one light 
located in the center was used. Just this one 
light was used because it was felt desirable, for 
this experiment, to eliminate search factors. 
Thus, if S failed to detect a light, it would not 
be because he happened to be searching the 
wrong area of the screen at the time of stimulus 
occurrence. With search eliminated, perform- 
ance can be more meaningfully interpreted in 
terms of S’s vigilance. 

As for apparatus details, the stimulus light 
appeared on a smooth white screen made of fine- 
grain white paper sandwiched between two 
layers of thin glass. The stimulus light was a 
6-v. No. 47 G.E. lamp located behind the center 
of the screen and masked so that it appeared on 
the screen as a 2-mm. diameter circle with ill- 
defined edges. Its color was dull orange 
because less than 6 v. was used for each of the 
two stimulus brightness levels employed in this 
experiment. The voltage was 2.1 for the low 
brightness stimulus and 2.3 for the high bright- 
ness stimulus. The color temperature for 2.1 
and 2.3 v. was computed and found to be 1650° 
and 1700° Kelvin, respectively. To avoid ap- 
paratus cues that might reveal stimulus light 
occurrence, E’s control and response recording 
equipment were located in another part of the 
building. If S flicked the toggle switch within 
5 sec. after the onset of the stimulus light, it was 
considered detected and was recorded on a 
counter. Responses occurring any time after 
termination of the 5-sec. response interval and 




before onset of the next stimulus light were 
recorded on a separate counter. 
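
The scoring rule just described can be sketched in a few lines. This is a minimal illustration, not the recording circuitry used in the study: the function name, the use of seconds as the time unit, and the example onset and response times are all hypothetical. Responses made before the first stimulus are counted as false here, which is an assumption the text does not address.

```python
RESPONSE_WINDOW_SEC = 5.0  # from the text: a response within 5 sec. of onset is a detection


def score_responses(onsets, responses, window=RESPONSE_WINDOW_SEC):
    """Return (number of stimuli detected, number of false responses).

    onsets    -- stimulus onset times in seconds, ascending
    responses -- switch-flick times in seconds
    A stimulus counts as detected if at least one response falls within
    `window` seconds after its onset. Every response outside all such
    windows goes to the separate (false-response) counter.
    """
    detected = set()
    false_responses = 0
    for r in responses:
        # Index of the stimulus whose response window contains r, if any.
        hit = next((i for i, t in enumerate(onsets) if t <= r <= t + window), None)
        if hit is None:
            false_responses += 1
        else:
            detected.add(hit)
    return len(detected), false_responses


# Hypothetical times: three stimuli and four switch flicks.
example = score_responses([60.0, 90.0, 135.0], [62.0, 100.0, 136.5, 138.0])
```

In this made-up example the first and third stimuli are detected, the second is missed, and the flick at 100 sec. falls outside every window and is counted as false.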

A trial was defined as the presentation of 10 
stimuli over a 10-min. period and S’s score on a 
trial was number of stimuli detected out of 10. 
The interstimulus intervals ranged from .25 to 
2.00 min. and, for the 10 stimuli presented on a 
trial, were used in the following sequence: 1.00, 
.50, .75, 1.75, .75, 2.00, .25, .50, 1.00, and 1.50
min. Thus, the procedure was: sound buzzer 
to signal start of test, 1.00-min. interval, 
Stimulus No. 1, .50-min. interval, Stimulus No. 
2, .75-min. interval, Stimulus No.3 . . . Stimu- 
lus No. 9, 1.50-min. interval, Stimulus No. 10 
(Trial 1 ends here and sequence is repeated 
without buzzer for Trial 2), 1.00-min. interval, 
Stimulus No. 11, .50-min. interval, Stimulus 
No. 12, etc.
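
As a check on the schedule, the ten intervals can be summed and accumulated into stimulus onset times. The sketch below assumes that each stimulus begins exactly at the end of its interval and ignores stimulus duration; the names `INTERVALS_MIN` and `onset_times` are illustrative and do not come from the study.

```python
from itertools import accumulate

# Interstimulus intervals for one trial, in minutes, as listed in the text.
INTERVALS_MIN = [1.00, 0.50, 0.75, 1.75, 0.75, 2.00, 0.25, 0.50, 1.00, 1.50]


def onset_times(n_trials, intervals=INTERVALS_MIN):
    """Return stimulus onset times in minutes from the start buzzer.

    The same interval sequence is repeated on every trial, so each onset
    is the running sum of all intervals up to and including its own.
    """
    return list(accumulate(intervals * n_trials))


trial_length = sum(INTERVALS_MIN)  # length of one trial in minutes
onsets = onset_times(11)           # Trials 1-11, the continuous watch
```

Under these assumptions one trial lasts exactly 10 min., Stimulus No. 10 falls at the 10-min. mark, and the 110th stimulus closes the 110-min. watch.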

Subjects.—The Ss were 61 basic airmen
trainees drawn from the population available 
at Lackland Air Force Base, Texas. These Ss 
were tested one at a time and were assigned at 
random to the four main groups. 

Procedure.—Following instructions and 20- 
min. adaptation to the prevailing low-level 
illumination in the experimental room, each S 
was given 11 trials (110 min. of continuous 
observation), 10 min. rest, and a final trial (No. 
12). The S was informed of the rest over an 
intercommunication system and was instructed 
to relax and pay no attention to the screen until 
he heard a buzzer. He did not leave the experi- 
mental room during the rest period. 

Four groups were differentiated on the basis 
of stimulus brightness and stimulus duration. 
The groups, their size, and their experimental 
conditions were: 


Group    N     Stimulus Brightness       Stimulus Duration
               (Apparent Ft.-Candles)
A        16    .019                      [...]
B        15    .019                      [...]
C        15    .016                      [...]
D        15    .016                      [...]


Although both .019 and .016 apparent ft.-candles
represent very low brightness levels, for
convenience .019 apparent ft.-candles will be
referred to as high brightness and .016 apparent
ft.-candles will be termed low brightness.
Brightness values of the stimuli were deter- 
mined with a Macbeth illuminometer. Read- 
ings were taken in a totally dark room with the 
illuminometer placed directly on the glass 
surface of the screen. Since testing ensued with 
a 25-w. ceiling bulb on, the brightness level of 
the screen under these conditions was measured 
from a distance of 5 in. and found to be .230 
apparent ft.-candles. Each of the foregoing 













brightness values represents the mean of 40 
observations—20 by each of 2 Os. 

Instructions emphasized aperiodic occurrence
of stimuli and the necessity for constant attention.
During the instruction period E gave S
demonstrations with stimuli of the same bright- 
ness and duration to be used in subsequent 
testing. Demonstration continued until S 
reported seeing three stimuli. To insure further
that S was aware of the nature of the stimulus
he was supposed to detect, any S who failed to
detect at least three stimuli on Trial 1 was
rejected. One S was rejected for failing to meet
this criterion.

Cigarettes, matches, lighter, and watch were 
taken from S prior to testing. A peephole
(unknown to S) allowed a frequent check by
E to insure that S was not deviating from prescribed
procedures. Deviations, such as shading
the screen with the hands, elicited a briefly
stated reminder from E over the intercommunication
system. These reminders were
very infrequently required and can be regarded
as a random factor. Although Mackworth (6) 
found that special instructions with motivating 
content increased the level of S vigilance, the 
brief reminder statements occurring in this 
study had no evident motivating quality and 
their rare and random occurrence would not be 
expected to influence average treatment effects. 


RESULTS


Figure 1 shows the results of this 
study. For each group, mean number 
of stimuli detected is plotted against 
trials. Values for Trial 12 are sepa- 
rate from the others to indicate they 
were preceded by the 10-min. rest. 
Trials 1-11 represent 110 min. of 
continuous observation and it can be 
seen in Fig. 1 that all groups show a 
steady decline in detection proficiency 
over this period. Over-all perform- 
ance level is positively related to 
stimulus brightness and duration with
Groups B and D showing a general 
superiority over Groups A and C. 
Within the limits of the values used, 
stimulus duration seems to be a more 
important variable than stimulus 
brightness in determining proficiency 
since both Groups B and D had a 
longer stimulus presentation time 
than Groups A or C. 




TABLE 1

Trend Analysis of Trials 1-11 for All Groups

Source                 df     Mean Square    F        p
Trials                 10     21.30          8.13     <.01
Groups                 3      482.67         20.33    <.01
Trials X Groups        30     2.77           1.06     >.05
Pooled Between-Ss      57     23.74
Pooled Trials X Ss     570    2.62
Total                  670











A statistical evaluation of per- 
formance of the four groups over 
Trials 1-11 was accomplished by 
trend analysis. The results of this 
analysis are given in Table 1. To 
test significance of between-groups 
differences, the hypothesis that over- 
all mean performance level of the four 
groups was the same was tested by 
using the pooled Between-Ss mean 
square as the error term for evaluating 
the Groups mean square. The F 
ratio was significant beyond the 1% 
level and permitted a rejection of the 
hypothesis. To evaluate decreasing 
performance trend over Trials 1-11, 
the hypothesis that mean level for all 
groups combined was the same on all 
trials was tested by using the Pooled 
Trials X Ss mean square as error term 
for the Trials mean square. The F 
ratio was significant beyond the 1% 


Fig. 1. Mean performance curves for all groups (mean number of stimuli detected plotted against trials).










TABLE 2

Mean Number of False Responses for Each Group on Each Trial


level and allowed rejection of this 
hypothesis. The Pooled Trials X Ss 
mean square was also used to test 
significance of the Trials X Groups 
mean square. The failure of the 
obtained F ratio to be significant at 
the 5% level indicated that perform- 
ance trend of the four groups over 
Trials 1-11 could be regarded the 
same. 
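
The three F ratios and the degrees of freedom in Table 1 can be reproduced from the reported mean squares and the design (61 Ss, 4 groups, 11 trials). The sketch below is only an arithmetic check; the dictionary keys and function names are illustrative, and the pairing of each effect with its error term follows the description in the text.

```python
# Mean squares as reported in Table 1.
MEAN_SQUARES = {
    "trials": 21.30,
    "groups": 482.67,
    "trials_x_groups": 2.77,
    "between_ss": 23.74,   # Pooled Between-Ss (error term for Groups)
    "trials_x_ss": 2.62,   # Pooled Trials X Ss (error term for Trials and Trials X Groups)
}


def f_ratio(effect, error):
    """F ratio of an effect mean square over its error mean square, to 2 decimals."""
    return round(MEAN_SQUARES[effect] / MEAN_SQUARES[error], 2)


def degrees_of_freedom(n_subjects, n_groups, n_trials):
    """Degrees of freedom for each source in the groups-by-trials analysis."""
    between_ss = n_subjects - n_groups
    trials = n_trials - 1
    groups = n_groups - 1
    return {
        "trials": trials,
        "groups": groups,
        "trials_x_groups": trials * groups,
        "between_ss": between_ss,
        "trials_x_ss": trials * between_ss,
        "total": n_subjects * n_trials - 1,
    }


f_trials = f_ratio("trials", "trials_x_ss")
f_groups = f_ratio("groups", "between_ss")
f_interaction = f_ratio("trials_x_groups", "trials_x_ss")
df = degrees_of_freedom(n_subjects=61, n_groups=4, n_trials=11)
```

The computed ratios agree with the tabled values of 8.13, 20.33, and 1.06, and the component degrees of freedom sum to the total of 670.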

All groups displayed a gain over the 
10-min. rest between Trials 11 and 12 
and the amount was about the same 
for all groups. The null hypothesis 
for each group’s mean gain was 
evaluated with the t test for related
measures. None of the four t ratios
obtained was significant at the 5% 
level with a consequent failure to 
reject the null hypothesis in each case. 
But, when gain measures for Ss in all 
groups were combined in a single t test, 
the mean gain of .85 was significant 
beyond the 1% level. 
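
For readers unfamiliar with the t test for related measures, the computation on gain scores can be sketched as follows. The individual scores were not reported, so the six pairs of Trial 11 and Trial 12 scores below are hypothetical and serve only to show the arithmetic; the function name is likewise illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def related_t(before, after):
    """Return (mean gain, t ratio, degrees of freedom) for paired scores.

    The t ratio is the mean of the gain scores divided by their
    standard error, with n - 1 degrees of freedom.
    """
    gains = [a - b for a, b in zip(after, before)]
    n = len(gains)
    standard_error = stdev(gains) / sqrt(n)
    return mean(gains), mean(gains) / standard_error, n - 1


# Hypothetical detection scores for six Ss (not data from the study).
trial_11 = [3, 5, 4, 6, 2, 5]
trial_12 = [4, 6, 4, 8, 3, 5]
mean_gain, t, df_t = related_t(trial_11, trial_12)
```

Because the standard error shrinks with the square root of the number of Ss, a gain of a given size that is not significant within a single group can become significant when all Ss are combined, which is the pattern reported above.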

Detection of a stimulus was con- 
sidered to have occurred if S re- 
sponded within 5 sec. after stimulus 
onset. All responses occurring be- 
tween the end of the 5-sec. response 
period and the onset of the succeeding 
stimulus were recorded separately. 
The mean number of these responses 
for each group on each trial is pre- 
sented in Table 2. These responses 
have been termed “false” in Table 2 
since they occurred in the absence of 


the stimulus light on the screen. 


False responses are most prominent 
with Groups A and C, with Group C
exhibiting the highest mean. For 
both Groups A and C, the mean 
number of these responses decreases 
with trials. Groups B and D had a 
negligible number of false responses. 


DISCUSSION


These data on vigilant behavior show
that average detection level of an
aperiodically presented low-intensity
visual stimulus is a function of stimulus
intensity, stimulus duration, and rest.
Detection proficiency was a decreasing
function of observation time for all
conditions, with rate of decrease independent
of stimulus duration and intensity
within the range studied. The
common rate of decrease is probably of
limited interest and generality. If
stimuli of greater intensity and longer
duration had also been included, there
is little doubt that they would almost
always have been detected and a common
rate of decrease for all groups would not
then have prevailed.

Bakan (footnote 4) and Berlyne (1)
have used Hull’s I_R concept (4) to
derive attention phenomena and this
concept seems to have utility in accounting
for the present data. Exercise
of the visual effectors over long periods
of watching leads to I_R accumulation
with a consequent decrease in measured
response level. In accordance with
Hull’s definition of I_R, interpolated rest
permits some or all of the inhibition to
dissipate and performance gain results.

Only a speculative explanation can be










given for the false responses. One 
possibility is that they were mainly 
long latency responses occurring just 
outside the 5-sec. response period. Al- 
though latencies were not recorded, this 
possibility is not considered likely. The 
E observed that false responses almost 
always occurred in the longer inter- 
stimulus intervals of 1-2 min. Only 
rarely did a response occur immediately 
after termination of the 5-sec. response 
period and suggest a long latency re- 
sponse to a detected stimulus. 

It is noteworthy that false responses 
only occurred to any extent in Groups 
A and C where detection was most 
difficult. This suggests the hypothesis 
that, when the stimulus to be detected 
represents only a very slight or very brief 
change in the surround, and when S’s 
expectancy level is high in long inter- 
stimulus intervals, response can be rather 
readily triggered by transient, non- 
relevant stimuli that by chance may 
occur at the spot where the stimulus 
light always appears. But, as obser- 
vation time progresses and S has seen a 
considerable number of relevant stimuli, 
he learns to discriminate the stimulus 
light from nonrelevant cues. This dis- 
crimination learning could account for 
the decrease in false responses for Groups 
A and C. For Groups B and D, the 
stimulus light was sufficiently distinct 
from nonrelevant cues and no discrimi- 
nation learning was necessary. This 
hypothesis will have to be checked with 
research where these cues are system- 
atically manipulated by the use of 


various amounts of visual noise. 




SUMMARY 


Vigilant or attentive behavior was studied. 
The Vigilance Test was used to evaluate the 
ability to detect small, low-intensity, aperiodi- 
cally presented visual stimuli over a relatively 
long period of continuous observation. Each 
of four groups was presented stimuli at one of 
two brightness levels and one of two presentation 
times. The watching period was 110 min. A 
10-min. rest was then given and this was fol- 
lowed by another 10 min. of watching. 

The results were as follows: (a) Average 
number of stimuli detected was related to 
stimulus brightness and duration. (b) All 
groups showed a steady decline in proficiency 
over the 110-min. period. (c) All groups dis- 
played gain over rest. (d) The two groups 
having a short stimulus presentation time made 
a number of responses in the absence of the 
stimulus light. 


REFERENCES 


1. Berlyne, D. E. Attention, perception and
behavior theory. Psychol. Rev., 1951, 58,
137-146.

2. Carpenter, A. The rate of blinking during
prolonged visual search. J. exp. Psychol.,
1948, 38, 587-591.

3. Fraser, D. C. The relation between angle
of displays and performance in a prolonged
visual task. Quart. J. exp. Psychol.,
1950, 2, 176-181.

4. Hull, C. L. Principles of behavior. New
York: D. Appleton-Century, 1943.

5. Lindsley, D. B., et al. Radar operator
“fatigue”: the effects of length and repetition
of operating periods on efficiency of
performance. OSRD Rep., 1944, Rep.
No. 3334.

6. Mackworth, N. H. Researches on the measurement
of human performance. London:
His Majesty’s Stationery Office, 1950.
(Medical Res. Council Spec. Rep. Ser.
No. 268.)


(Received September 2, 1955) 

















